Difference between revisions of "X264 TODO"

From VideoLAN Wiki

Revision as of 21:43, 13 September 2010

This page is an incomplete list of open tasks in x264 that are available for you to work on. It's organized into sections covering various parts of x264.

Some useful resources: Dark Shikari's pile of junk, Pengvado's pile of junk.

Motion Estimation

  • Sequential elimination (SEA), used for exhaustive search, might be more generally applicable to algorithms like UMH, by letting us skip a lot of SADs. The downside is we won't be able to use SAD_X4 anymore.
  • (T)ESA is currently wrong for motion searches done on weightp duplicates. This effect is minuscule, but it should still be fixed.
  • Hierarchical motion estimation might be a useful way to catch very long motion vectors without the cost of UMH or ESA. It might also help regularize motion.
  • Somehow take into account the effect of motion vector decision on future blocks.
    • Hierarchical motion estimation
    • Approximations from lookahead MVs
    • Iterative ME (as per Snow)
    • Trellis motion estimation
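
As a rough illustration of the sequential elimination idea in the first item, here is a minimal C sketch. It relies on the bound |sum(cur) - sum(ref)| <= SAD(cur, ref), so any candidate whose sum difference is already at least the best SAD found so far can be skipped without computing its SAD. The function names, the 1-D candidate layout, and the per-candidate recomputation of the reference sum are all assumptions made for brevity; this is not x264's code, and a practical version would read the sums from a precomputed table. Candidates are tested one at a time, which is why batched SAD_X4-style evaluation does not fit this scheme.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two n-sample blocks. */
static int block_sad(const uint8_t *a, const uint8_t *b, int n)
{
    int sad = 0;
    for (int i = 0; i < n; i++)
        sad += abs(a[i] - b[i]);
    return sad;
}

/* Sum of the samples in an n-sample block. */
static int block_sum(const uint8_t *a, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hypothetical 1-D search: try offsets 0..num_cand-1 into ref and return
 * the offset with the lowest SAD against cur.  *sads_computed reports how
 * many full SADs were actually evaluated. */
static int sea_search_1d(const uint8_t *cur, const uint8_t *ref, int n,
                         int num_cand, int *sads_computed)
{
    assert(n > 0 && num_cand > 0);
    int cur_sum = block_sum(cur, n);
    int best_sad = INT_MAX;
    int best_pos = -1;
    *sads_computed = 0;

    for (int pos = 0; pos < num_cand; pos++) {
        /* Recomputed here for brevity; normally a table lookup. */
        int ref_sum = block_sum(ref + pos, n);

        /* Lower bound on the SAD: if it cannot beat the best, skip. */
        if (abs(cur_sum - ref_sum) >= best_sad)
            continue;

        int sad = block_sad(cur, ref + pos, n);
        (*sads_computed)++;
        if (sad < best_sad) {
            best_sad = sad;
            best_pos = pos;
        }
    }
    return best_pos;
}
```

Whether the skipped SADs outweigh the loss of batched SAD calls for a pattern search such as UMH is exactly the open question in the item above.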

Intra Analysis

  • Make the early terminations smarter. Currently they're just hacks -- some statistical analysis might be useful.
  • The SAD-based (subme 1) i8x8 vs i4x4 decision is not very accurate. Can it be improved without significant speed loss?

Mode Decision

  • Can we find more ways to skip more motion searches in multiref?
  • On extremely fast encoding settings, fast skip is actually kind of slow. But anything dumber (e.g. SAD) is completely useless. Is there some better balance that can be achieved here?
  • Chroma-aware mode decision for B-frames?
  • See the TODOs for deblock-aware RD in common/deblock.c.

Psy

  • Psy-RD is a hack. It works, but it's a hack. If you apply QNS with Psy-RD as the metric, it goes way overboard and gives terrible results. This means that Psy-RD only works because normal mode decision is limited in the way it can modify the image to better suit the metric. Is there a way to make it better?
  • Should RD be linear at all? Perhaps we should weight more heavily against low quality blocks and also try to ignore minuscule distortion that viewers can't see.
  • Psy-trellis (and maybe psy-RD?) are too strong at very high QPs.
  • Psy-trellis should be merged with Psy-RD. There are patches for this, but they probably won't be committed until psy-RD itself is fixed.
  • RD should take into account local variance.
  • Lambda should be varied on a per-DCT-block basis instead of a per-macroblock basis.
  • Lambda should be picked independent of quantizer (i.e. with greater precision).
  • Classic problem: a block is mostly high complexity but has a small area of low complexity. How do we judge whether that area is important? Good example: sharp text on background with film grain; grain gets blurred out because of the text.
    • If we think it's important all the time, we ruin the quality of many clips that rely on raising complexity on edges (Touhou).
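
To make the "should RD be linear at all?" question concrete, here is a minimal sketch of one possible nonlinear distortion term. Everything in it is hypothetical: the visibility threshold, the knee, and the quadratic penalty are illustrative choices, not anything x264 implements or has tuned. Distortion below the threshold is ignored, and distortion above it is penalized faster than linearly so that low-quality blocks weigh more heavily.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapping from raw SSD to a "perceptual" distortion.
 *   visible: SSD at or below this is assumed invisible and costs nothing.
 *   knee:    excess SSD at which the quadratic term equals the linear term.
 * Intended for block-sized SSD values; very large inputs would overflow. */
static uint64_t nonlinear_distortion(uint64_t ssd, uint64_t visible,
                                     uint64_t knee)
{
    assert(knee > 0);
    if (ssd <= visible)
        return 0;
    uint64_t excess = ssd - visible;
    /* Linear term plus a quadratic term that dominates past the knee. */
    return excess + excess * excess / knee;
}

/* Rate-distortion cost J = D' + lambda * R using the mapping above. */
static uint64_t rd_cost(uint64_t ssd, uint64_t bits, uint64_t lambda,
                        uint64_t visible, uint64_t knee)
{
    return nonlinear_distortion(ssd, visible, knee) + lambda * bits;
}
```

With visible = 10 and knee = 100, an SSD of 5 costs nothing while an SSD of 110 costs 200 rather than 110. Whether any such curve actually helps visually would have to be tested.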

Lookahead

  • Lookahead should be multithreaded, either by splitting the frame (sliced threads) or running multiple frame analysis calls at once.
  • Temporal MV predictors in lookahead? There's a patch for these somewhere, but it biased the lookahead heavily in favor of B-frames, likely because it improved the motion search.

Quantization

  • CAVLC "trellis" is a hack. It works, but it's a hack. Make it better. See the TODOs in encoder/rdo.c.
  • There's room for something between trellis and deadzone in terms of complexity. libvpx has a good example -- it biases towards zero-runs in its "medium speed" quantizer. This can't be SIMD'd easily, but is still vastly faster than trellis. A nonlinear quantizer (one that is more likely to round up larger coefficients) might also be useful.
  • Floyd-Steinberg for quantization? Try pushing quantization error to nearby DCT coefficients. Should this go from high to low or low to high?
  • Energy-preserving quantizer -- maintain L1 (or maybe L2? I'm not sure) energy. Should we maintain it in the spatial domain (post-iDCT) or residual domain? Probably the former.
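
The Floyd-Steinberg idea above can be sketched in a simplified 1-D form. Real Floyd-Steinberg dithering spreads the error over several neighbours with fixed weights; the version below pushes all of each coefficient's rounding error into the next coefficient visited, and takes the scan direction as a parameter so both orders can be tried. The function name, the plain integer step size, and the flat coefficient array are assumptions for illustration, not x264's quantizer interface.

```c
#include <assert.h>
#include <stdint.h>

/* Quantize n coefficients with a uniform step, carrying each rounding
 * error into the next coefficient in scan order.
 *   dir = +1: scan from index 0 upward (low to high frequency)
 *   dir = -1: scan from index n-1 downward (high to low frequency) */
static void quant_error_diffusion(const int *coef, int *level, int n,
                                  int step, int dir)
{
    assert(n > 0 && step > 0 && (dir == 1 || dir == -1));
    int carry = 0;
    int i = (dir > 0) ? 0 : n - 1;

    for (int k = 0; k < n; k++, i += dir) {
        int target = coef[i] + carry;
        /* Round to nearest, symmetric around zero. */
        int q = (target >= 0) ?  ((target + step / 2) / step)
                              : -((-target + step / 2) / step);
        level[i] = q;
        /* Error left over after dequantization moves to the next one. */
        carry = target - q * step;
    }
}
```

For coefficients {14, 14, 14, 14} and a step of 10, plain rounding gives levels {1, 1, 1, 1} (reconstructed sum 40 against an original 56), while the low-to-high scan gives {1, 2, 1, 2} (reconstructed sum 60). This says nothing about bit cost, which is presumably what would decide the direction question in practice.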

Interlacing

  • Finish adaptive MBAFF. Talk to horlicks about this one.
  • Make slice-max-mbs/slice-max-size work with interlacing. Pretty much requires a good portion of adaptive MBAFF.

Weighted Prediction

  • Make weightp work with interlacing. Preferably abuse reference duplication to make it useful for MBAFF.
  • Make weightp work with chroma. Talk to DylanZA about getting his current patch for this one.
  • Finish K-means decision for weightp. Talk to DylanZA about getting his current patch for this one.
  • Add explicit weighting for B-frames, too. This helps in nonlinear fades, among other cases.
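
For reference, the per-sample operation behind explicit weighted prediction from a single reference in H.264 can be sketched as follows: the reference sample is scaled by a weight, rounded and shifted by a log2 denominator, offset, and clipped. The helper names below are made up for this example and the sketch assumes 8-bit samples; it is not taken from x264's source.

```c
#include <assert.h>
#include <stdint.h>

/* Clip an integer to the 8-bit sample range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Apply an explicit weight and offset to one reference sample.
 * Negative weights rely on arithmetic right shift, which is
 * implementation-defined in C but provided by common compilers. */
static uint8_t weight_sample(uint8_t ref, int weight, int offset,
                             int log2_denom)
{
    assert(log2_denom >= 0 && log2_denom <= 7);
    int v = ref * weight;
    if (log2_denom >= 1)
        v = (v + (1 << (log2_denom - 1))) >> log2_denom;
    return clip_u8(v + offset);
}
```

With log2_denom = 6, a weight of 32 halves the sample (100 becomes 50), which is the kind of scaling a fade needs. A single weight and offset model a linear fade well; the nonlinear fades mentioned above are where per-reference weights for B-frames could help.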