X264 Development Newsletters
Revision as of 07:22, 26 January 2011
This is an archive of the x264 Development Newsletters.
Volume 1
9/18/2010
From now on, with each weekly-ish push of new patches, I'll try to post a newsletter as well to give an update of what's going on and the changes.
Fixes:
Different versions of 64-bit MinGW gcc were inconsistent with regard to whether or not to add a _ prefix to called functions. This broke compilation because the assembly code wouldn't link in correctly.
Intra refresh got a lot of fixes thanks to bug reports by Chris Brien <chris.brien@tandberg.com>. Because x264 uses column refresh instead of row refresh, it has to be more careful with regard to which pixels it can predict from. This should fix some cases where intra refresh didn't refresh the whole frame correctly. Furthermore, it turns out that the recovery_frame_cnt syntax element, signaling the length of the intra refresh, is limited by the max frame number. Accordingly, x264 will raise the max frame number to ensure that recovery_frame_cnt is in spec.
Improvements:
I've gone through the file headers and updated them to be more accurate and consistent in terms of dates and authorship. It now also mentions the commercial license, on request from a licensee.
The new --disable-gpl configure option is now added for companies using the commercial license. x264 --version will print both the license information of itself and of the linked libavformat (for decoding). It'll also warn you if the combination you've put together isn't distributable, such as a nonfree ffmpeg + GPL x264, or a GPL ffmpeg + non-GPL x264.
x264 now passes the "full chroma input" flag to swscale. This should fix this quality issue when using point scaling: http://doom10.org/index.php?topic=585.0 .
New features:
Kieran Kunhya <kieran@kunhya.com> has finally committed his arbitrary SEI patch, which lets calling apps give x264 any SEI they want to include in the stream. This is useful for closed-captioning, frame packing, and all kinds of other SEIs that x264 doesn't explicitly support but which the calling application can create itself if necessary. The advantage of giving these to x264 instead of just inserting them separately is that it allows the resulting stream to remain HRD-compliant, since x264 knows they exist.
Upcoming:
High bit-depth input and filtering! x264 supports 10-bit encoding, but currently only takes 8-bit input. This change will finally create a stable API for high bit-depth input and allow users to filter in higher precision. It will also add dithering support to x264CLI for downconversion of high-bit-depth material.
HQDN3D (denoiser) and YADIF (deinterlacer) will be making their way into the x264CLI filter framework.
Chroma ME in B-frames will be coming soon, giving an extra bit of compression to those using slower presets.
Adaptive MBAFF development is coming along, with B-frames being finished up currently.
After high bit-depth input is complete, a large amount of assembly code will be committed as well, improving high bit-depth encoding performance greatly.
Ubuntu Maverick will include an updated x264, r1653.
Volume 2
9/27/2010
This is the second x264 development newsletter. If you missed the first one, this is a regular email containing updates on fixes and improvements in the most recent x264 push, along with updates on what's coming next.
Fixes:
libx264 should now give correct DTS and print the correct bitrate if the input PTS do not start at zero. This doesn't affect x264CLI, but does affect calling apps that don't start their PTS at zero.
libx264 ratecontrol should now work correctly in CFR mode with timebase != 1/fps. As documented, x264 was supposed to ignore the timebase if CFR mode is set (for purposes of ratecontrol), but that's not what actually happened; things just broke. This should now be fixed. This is necessary for correct behavior with tune=zerolatency in gstreamer.
A missing emms has been added to dump-yuv, fixing some Heisenbug brokenness when dump-yuv was used.
Improvements:
Slice-max-size has been made more aggressive. Previously we just made some blind assumptions about the distribution of escape bytes and other such inaccuracies -- now it should work absolutely. It accounts for a few extra bytes that are highly predictable (CABAC's extra byte, CAVLC's skip runs). The primary problem, however, was caused by the terrible design of H.264's CABAC, which outputs a string of zeroes if CABAC is perfectly adapted, thus forcing tons of escape bytes for any perfectly uniform image. Slice-max-size should be fixed in this case.
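To make the escape-byte problem concrete, here is a minimal sketch of H.264 emulation prevention in C. The helper name nal_escape is hypothetical and this is not x264's own code; it only illustrates why a long run of zero bytes inflates the slice and makes its final size hard to predict.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not x264's implementation): apply H.264 emulation
 * prevention to a raw payload. Whenever two consecutive zero bytes would be
 * followed by a byte <= 0x03, an extra 0x03 byte is inserted first.
 * dst must have room for at least src_len + src_len / 2 bytes.
 * Returns the escaped length. */
size_t nal_escape(uint8_t *dst, const uint8_t *src, size_t src_len)
{
    size_t out = 0;
    int zeros = 0; /* consecutive zero bytes written so far (0..2) */

    assert(dst != NULL && (src != NULL || src_len == 0));
    for (size_t i = 0; i < src_len; i++) {
        if (zeros == 2 && src[i] <= 0x03) {
            dst[out++] = 0x03; /* emulation prevention byte */
            zeros = 0;
        }
        dst[out++] = src[i];
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
    }
    return out;
}
```

A payload of six zero bytes comes out as 00 00 03 00 00 03 00 00, so an all-zero CABAC output grows by up to half again, which is the case a slice size limit has to budget for.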
Chroma mode decision and subpel, mentioned in the last newsletter, are finally in. They're enabled at subme >= 9 and provide a ~0.4-1% compression improvement.
New features:
High-bit-depth support has been finished in libx264 and x264cli. The assembly code is still yet to be committed, so it's a bit slow, but libx264 now supports high-bit-depth input. Additionally, the filter framework now supports 16-bit, which will be useful for future filters to avoid rounding error and banding. It'll automatically perform dither to go down from 16-bit to 8, 9, or 10-bit, thus making this useful for 8-bit encoding as well.
High 10 Intra profile is supported now, which makes x264 now capable of encoding AVC Intra 50. See the commit message for a sample commandline.
Upcoming:
High bit depth assembly code is coming soon, which should give a nice (3x+) improvement to high bit depth encoding speed.
HQDN3D (denoiser) and YADIF (deinterlacer) will be making their way into the x264CLI filter framework. This was previously being blocked by the high-bit-depth modifications to the filter framework.
Adaptive MBAFF development is coming along, with B-frames being finished up currently.
Volume 3
10/10/2010
This is the third x264 development newsletter. If you missed the first two, this is a regular email containing updates on fixes and improvements in the most recent x264 push, along with updates on what's coming next.
Fixes:
The sigint handler in x264cli is now volatile, like it should have been. This likely didn't actually affect anything on any real system, but it's more correct.
-DNDEBUG broke the filtering system. This is now fixed.
The bugfix from last week regarding intra refresh predicting from topright blocks didn't work with 8x8dct. This is now fixed.
2-pass ratecontrol now works with CBR HRD. Previously, it ignored filler bits, and thus broke horribly. In general, you don't need 2-pass for CBR (and it probably doesn't help), but it should work better now.
A missing mod4-stack check was added; this should fix ICC builds on Phenom CPUs.
Improvements:
The build tree has been cleaned a bit. A bunch of old stuff, like the non-working regression test and doxy, have been removed.
Some of the information in doc/ has been updated to be less absurdly out of date, or at least inform the reader that it is absurdly out of date.
DTS compression has been moved out of the libx264 API and into the muxers, because it's a giant hack and it was making things ugly and messy (and starting to interfere with ratecontrol).
Various asm functions have been improved, particularly the qpel MC functions. Should help performance a bit on Core 2 and similar CPUs.
Upcoming:
High bit depth assembly code is coming soon, which should give a nice (3x+) improvement to high bit depth encoding speed.
HQDN3D (denoiser) and YADIF (deinterlacer) will be making their way into the x264CLI filter framework. This was previously being blocked by the high-bit-depth modifications to the filter framework.
Adaptive MBAFF development is coming along, with B-frames being finished up currently.
x262 is under development: a best-in-class MPEG-2 encoder built using the x264 framework.
Volume 4
11/10/2010
This is the fourth x264 development newsletter. If you missed the first three, this is a regular email containing updates on fixes and improvements in the most recent x264 push, along with updates on what's coming next. Previous versions can be found in the mailing list archives.
Fixes:
The Altivec SATD gave wrong results with very small strides (e.g. 8). This made output on PPC slightly different from x86, due to different results in lookahead and chroma ME. This is now fixed.
FPS reporting on Windows 64-bit was very imprecise (the timers were only accurate to 1 second). This was due to _ftime using the wrong struct due to the Windows ABI's complete retardation. This has been fixed by using ftime instead.
ssd_nv12 has been improved to eliminate overflow at high bit depths (this would, in theory, give wrong PSNR).
SATD/SA8D/Hadamard_ac have been improved to eliminate overflow at high bit depths (this would cause wrong analysis results).
Three ratecontrol bugs have been fixed:
- B-frame size prediction has been fixed (it used quantizer instead of linearized qscale). This should improve VBV.
- Overflow compensation, originally designed for 1-pass ABR, has been disabled with CRF on, as it doesn't do anything useful with pure CRF and causes very bizarre results with CRF+VBV in areas of extremely high complexity.
- A bug in the way that clip_qscale caps the amount by which a frame can be higher quality than a previous frame (as a result of attempting to use up extra bits that would otherwise be wasted) has been fixed. This caused CBR to take extremely long times (e.g. seconds) to recover from extreme changes in complexity at low bitrates.
A possible linking problem with lavf has been fixed.
Some MV/ref prefetching code was in the wrong place. Fixing this may result in a tiny performance improvement.
Improvements:
doc/threads.txt has been updated with modern benchmarks. This information may be useful to anyone looking to compare the impact of various threading strategies on quality and performance, as well as get an idea of how well x264 scales over multiple cores at various encoding speeds.
The presets are now addressable numerically (0=ultrafast...9=placebo). The exact mappings may change in the future if new presets are added or old presets removed, but they will always be in linear order of fast to slow.
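As a rough illustration of numeric preset addressing, the sketch below resolves either a number or a name against an ordered table. The resolve_preset helper is hypothetical, and the positions between 0 and 9 are assumed from the usual fast-to-slow preset order; as noted above, the exact mapping may change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Preset names in assumed fast-to-slow order (index 0 = ultrafast,
 * index 9 = placebo). */
static const char *const preset_names[] = {
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo"
};

/* Hypothetical helper: accept either a numeric index or a preset name.
 * Returns the canonical name, or NULL if the argument matches nothing. */
const char *resolve_preset(const char *arg)
{
    const size_t count = sizeof(preset_names) / sizeof(preset_names[0]);
    char *end;
    long index;

    assert(arg != NULL);
    index = strtol(arg, &end, 10);
    if (end != arg && *end == '\0') /* the whole argument was a number */
        return (index >= 0 && (size_t)index < count) ? preset_names[index] : NULL;

    for (size_t i = 0; i < count; i++)
        if (strcmp(arg, preset_names[i]) == 0)
            return preset_names[i];
    return NULL;
}
```

Because only the ordering is promised to be stable, scripts that need a specific preset are probably better off passing its name than its number.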
The ffmpeg -vpre error message (if a user doesn't use a -vpre) has been made more descriptive, to better instruct users about how the ffmpeg -vpre system works.
weightp -1 offset dupes are now disabled in high bit depth mode, as they're a hack to get around crappy rounding in 8-bit encoding, and thus don't help in high bit depth.
PSNR and SSIM measurements are now VFR-aware! This means they will take into account the duration of frames. This may result in PSNR and SSIM results appearing dramatically different from other tools when used on variable framerate video. Don't fret; x264 is correct, they are not. Of course, results will stay the same for constant framerate video. This is the first step to VFR-aware MB-tree.
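One way to picture a VFR-aware average is to weight each frame's score by how long it stays on screen. The sketch below is a simplified assumption about the idea, not x264's actual accumulation code, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: average a per-frame metric (e.g. PSNR in dB),
 * weighting each frame by its display duration. With equal durations
 * this reduces to the ordinary mean, so CFR results are unchanged. */
double duration_weighted_mean(const double *values, const double *durations, size_t n)
{
    double weighted_sum = 0.0;
    double total_duration = 0.0;

    assert(values != NULL && durations != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        weighted_sum += values[i] * durations[i];
        total_duration += durations[i];
    }
    assert(total_duration > 0.0);
    return weighted_sum / total_duration;
}
```

For two frames scoring 40 and 30 dB, a plain mean gives 35 dB, but if the first frame is displayed three times as long the weighted result is 37.5 dB. That gap is why VFR-aware numbers can differ from tools that count every frame equally.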
Quantizer handling has been improved:
- Library change: i_qpplus1 now defaults to X264_QP_AUTO. This doesn't actually change anything, as X264_QP_AUTO is 0, but this may change in the future. As a reminder, all x264_picture_t structs must be initialized using x264_picture_init or x264_picture_alloc!
- CRF values now make sense in high bit depth: --crf 23 means roughly the same quality in 8-bit or 10-bit, instead of being 4 times higher quality in 10-bit than in 8-bit. This means that --qp 0 should be used for lossless, not --crf 0. The latter will not result in lossless compression when in high bit depth mode.
- Bit depth has been added to statsfiles, to prevent users from inadvertently using a statsfile with the wrong bit depth.
Scenecut's flash detection has been improved to work more sanely in the case of flashes at the end of a video.
Upcoming:
High bit depth assembly code is nearly ready. Speed boost has been measured as about 4.3x and is still improving.
VBV Emergency Mode is finally completed, with just fine-tuning and bugfixing left. This makes x264 able to deal gracefully with extreme input combined with VBV restrictions (e.g. noise, Doremi Labs test boxes). This is critical for some broadcast applications. Unlike most competing encoders, this VBV Emergency Mode does not drop frames or force all blocks to skip or some similarly extreme step: it adds a "denoising" step to reduce the complexity of the video, which scales upwards until it simply removes all content at the most extreme level.
A pad filter is nearly ready for the x264CLI filter framework.
HQDN3D (denoiser) and YADIF (deinterlacer) will be making their way into the x264CLI filter framework.
Adaptive MBAFF development is coming along, with B-frames being finished up currently.
x262 is under development: a best-in-class MPEG-2 encoder built using the x264 framework. Basic structure is done, with intra coding mostly finished.
Work is planned to integrate x264 with the Sandy Bridge's encoding ASIC for improved encoding performance. Current status is: waiting on Intel.
Volume 5
11/19/2010
This is the fifth x264 development newsletter. If you missed the first four, this is a regular email containing updates on fixes and improvements in the most recent x264 push, along with updates on what's coming next. Previous versions can be found in the mailing list archives.
Note that we pushed a bugfix release this time around, so this newsletter includes fixes from those commits as well, i.e. it covers everything since the last newsletter.
Fixes:
HRD now works correctly when used with intra refresh. Thanks to a certain x264 commercial licensee for the bug report.
QPfile parsing now works as it was supposed to have worked last time: users should be able to omit QP values.
Allocate the correct amount of memory for weightp buffers with weightp + high bit depth (it allocated too much, not too little).
Fix flash detection to work correctly near the end of the keyframe interval.
Fix a crash in dump-yuv with some resolutions.
Various fixes have been made to ratecontrol in high-bit-depth mode.
Constrained intra pred is working properly again in all cases.
Improvements:
x264's SEI header now indicates whether the build used was GPL or proprietary.
x264 is now compatible with FFMS2's most recent API break.
configure now logs test programs that failed, not just the error output.
Merge Oskar's (irock's) 10-bit asm branch: ~4.4x overall speed boost in high bit depth mode.
Chroma weighted prediction: dramatically improved chroma compression and quality in fades.
Custom cropping rectangle support: users can now specify --crop-rect to add values to the H.264 cropping header. This is supposedly useful for 3D television applications (to allow legacy decoders to access only one view of the image).
Upcoming:
VBV Emergency Mode is finally completed, with just fine-tuning and bugfixing left. This makes x264 able to deal gracefully with extreme input combined with VBV restrictions (e.g. noise, Doremi Labs test boxes). This is important for some broadcast applications.
The pad filter and yadif are nearly ready for the x264CLI filter framework. Both now support high bit depth.
Adaptive MBAFF development is coming along, with B-frames being finished up currently.
x262 is under development: a best-in-class MPEG-2 encoder built using the x264 framework. Basic structure is done, with intra coding finished and inter coding begun.
Work is planned to integrate x264 with the Sandy Bridge's encoding ASIC for improved encoding performance. Current status is: waiting on Intel (these guys move at the speed of a three-toed sloth swimming down a river of bricks).
Other news:
Since we started a couple of months ago, over 40 companies have contacted us regarding x264 licensing!
At least one major Blu-ray authoring house is switching to x264 for their commercial Blu-rays.
At least one commercial encoding application based on x264 is currently in the works. An announcement will come Soon™.