Hacker Guide/Video Filters
This is the Hacker's Guide for video filters (video filter2, VLC 1.2).
Location in source tree
Video filters belong in the modules/video_filter/ folder.
See modules/video_filter/Modules.am for a list of video filters currently in VLC, and which source files belong to which.
See new module integration for additional help on how to integrate a new module into the build system.
Individual filter docs
Some filters may have their own Hacker's Guide sections.
Currently, the deinterlace filter has both a Hacker's Guide section and user documentation with some useful technical information.
Programming conventions and tips
VLC video filters are usually written in C99, like the rest of VLC. The AtmoLight plugin uses C++, though.
The usual stuff holds here too:
- Use assert() for checking conditions that always hold in the absence of bugs.
- Don't use assert() for checking stuff that may fail at runtime. Handle errors gracefully.
- Keep your code readable. Make private functions when it simplifies things.
- Prefer 80 characters per line at maximum, if possible.
Particularly for video filters: keep in mind that this is low-level stuff and needs to be fast.
- How fast? Note that the source may be interlaced, and someone may have thrown a framerate doubler into the filter chain. For a 60i signal, that means 60 output frames per second, so the entire processing chain from decode to render has about 16.7 ms (1000 ms / 60) per output frame.
- Keep the required memory bandwidth down.
- Don't copy whole pictures unless you absolutely must. Moving pixels costs time.
- Allocate an output picture and render directly into it.
- If you need more than one pass, do in-place processing in the subsequent ones if possible.
- If your processing cannot avoid multiple passes, consider doing all passes for a picture line (or a few) at a time, instead of iterating over the whole picture separately for each pass. This will help to keep down the pressure on the CPU cache.
- Consider writing the inner processing loops in vectorized inline assembly, such as MMX. This can often gain a factor of 2-8x in speed, depending on how much and what kind of processing your algorithm needs.
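The line-at-a-time advice above can be sketched roughly as follows. This is a minimal illustration only: the two passes (halving and adding an offset) and all function names are made up for the example, and 8-bit planar samples are assumed. The point is the loop structure: both passes run on one line while it is still hot in the cache, and the second pass works in place on the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pass 1 (hypothetical): halve every sample of one line, src -> dst. */
static void demo_pass_halve_line(uint8_t *dst, const uint8_t *src, int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = src[x] / 2;
}

/* Pass 2 (hypothetical): add an offset in place, saturating at 255. */
static void demo_pass_offset_line(uint8_t *line, int width, int offset)
{
    for (int x = 0; x < width; x++) {
        int v = line[x] + offset;
        line[x] = (uint8_t)(v > 255 ? 255 : v);
    }
}

/* Run both passes on each line before moving on to the next line,
 * instead of sweeping the whole plane once per pass. */
static void demo_process_plane(uint8_t *dst, int dst_pitch,
                               const uint8_t *src, int src_pitch,
                               int width, int lines)
{
    /* Invariant, not a runtime error: a line never exceeds its pitch. */
    assert(width <= dst_pitch && width <= src_pitch);

    for (int y = 0; y < lines; y++) {
        uint8_t *d = dst + (size_t)y * dst_pitch;
        const uint8_t *s = src + (size_t)y * src_pitch;

        demo_pass_halve_line(d, s, width);    /* first pass: src -> dst */
        demo_pass_offset_line(d, width, 16);  /* second pass: in place  */
    }
}
```

Whether this actually beats whole-picture passes depends on the picture size and the cache, so measure before and after.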
Some notes on video filter API
Note: See modules/video_filter/deinterlace/deinterlace.c and modules/video_filter/deinterlace/deinterlace.h for an example of everything discussed in this section. See also the Deinterlace Hacker's Guide for some further details.
The video filters use an object-based model. Calls to the filter will come with an instance pointer, so each filter instance can keep local data.
Usually, each filter defines its own internal data structure called filter_sys_t. It is completely private to the filter.
The API for video filters in VLC 1.2 is called video filter2. This requires the functions Open() and Close(). The Open() function is expected to allocate private data (if any), and set up the filter structure with pointers to a frame processing function (p_filter->pf_video_filter, which is the actual filter), a flush function (p_filter->pf_video_flush), and a mouse event function (p_filter->pf_video_mouse). The Close() function should deallocate private data.
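The overall shape of Open() and Close() might look something like the sketch below. Note that it deliberately does not include the VLC headers: demo_filter_t, demo_picture_t and demo_filter_sys_t are stand-in types invented for this example so that it compiles on its own, and the real Open() and Close() are module callbacks with their own signatures and return codes. The mouse event function is left out for brevity. For the real thing, see deinterlace.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in types for illustration only. The real filter_t and picture_t
 * come from the VLC headers and have many more members. */
typedef struct demo_picture_t demo_picture_t;
typedef struct demo_filter_t demo_filter_t;

struct demo_picture_t { int dummy; };

/* Private per-instance data, in the role of filter_sys_t. */
typedef struct demo_filter_sys_t { int i_frames_seen; } demo_filter_sys_t;

struct demo_filter_t {
    demo_filter_sys_t *p_sys;
    demo_picture_t *(*pf_video_filter)(demo_filter_t *, demo_picture_t *);
    void (*pf_video_flush)(demo_filter_t *);
};

/* Frame processing function: here it only counts frames. */
static demo_picture_t *Filter(demo_filter_t *p_filter, demo_picture_t *p_pic)
{
    assert(p_filter->p_sys != NULL); /* always holds after a good Open() */
    p_filter->p_sys->i_frames_seen++;
    return p_pic;
}

/* Flush: reset state, but keep the allocations. */
static void Flush(demo_filter_t *p_filter)
{
    p_filter->p_sys->i_frames_seen = 0;
}

/* Open: allocate private data and hook up the callbacks. */
static int Open(demo_filter_t *p_filter)
{
    demo_filter_sys_t *p_sys = calloc(1, sizeof(*p_sys));
    if (p_sys == NULL)
        return -1; /* may fail at runtime: handle it, don't assert() */

    p_filter->p_sys = p_sys;
    p_filter->pf_video_filter = Filter;
    p_filter->pf_video_flush = Flush;
    return 0;
}

/* Close: release whatever Open() allocated. */
static void Close(demo_filter_t *p_filter)
{
    free(p_filter->p_sys);
    p_filter->p_sys = NULL;
}
```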
Filter work (frame processing function)
The frame processing function takes a frame as input (picture_t*, include/vlc_picture.h), and optionally outputs one or more frames (picture_t*) as a linked list. The usual case is one frame in, one frame out, but this is not a requirement. (See framerate doublers and IVTC in the deinterlace module for counterexamples.)
If you do not wish to output a frame at a particular call to your processing function, you can return NULL. If you wish to output several, make a linked list by using the next member of picture_t.
You don't necessarily have to output the same frame (after processing) that just came in. You can keep a private input frame history in your filter_sys_t, pick the base frame for your output from that, and adjust the PTSs (presentation timestamps, member date in picture_t) of your output as you deem necessary. The deinterlacer provides an example of how to do this.
Note however that there may be other filters in the chain that already set up an offset; this leaves less time (until the designated PTS) for the rest of the filters to do their processing.
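As a rough illustration of the "one frame in, several out" case, here is a sketch of how a framerate doubler could chain two output pictures and space their PTSs. It uses a stand-in demo_picture_t with only a timestamp and a link (called p_next in this sketch), and a hypothetical demo_new_picture() in place of the real allocator, so it says nothing about how pixels are handled. The frame duration parameter is also an assumption of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for picture_t: only a timestamp and the list link. */
typedef struct demo_picture_t {
    int64_t date;                  /* PTS, in microseconds here */
    struct demo_picture_t *p_next; /* next output picture, or NULL */
} demo_picture_t;

/* Hypothetical allocator standing in for the real output allocation. */
static demo_picture_t *demo_new_picture(void)
{
    return calloc(1, sizeof(demo_picture_t));
}

/* Free a whole chain of output pictures. */
static void demo_release_chain(demo_picture_t *p_pic)
{
    while (p_pic != NULL) {
        demo_picture_t *p_next = p_pic->p_next;
        free(p_pic);
        p_pic = p_next;
    }
}

/* Framerate-doubler shape: one picture in, two pictures out.
 * Returns the head of the output list, or NULL if nothing is output. */
static demo_picture_t *demo_double(const demo_picture_t *p_in,
                                   int64_t i_frame_duration)
{
    assert(p_in != NULL);

    demo_picture_t *p_first = demo_new_picture();
    if (p_first == NULL)
        return NULL; /* allocation can fail at runtime: output nothing */

    demo_picture_t *p_second = demo_new_picture();
    if (p_second == NULL) {
        free(p_first);
        return NULL;
    }

    /* First output keeps the input PTS, second lands half a frame later. */
    p_first->date = p_in->date;
    p_second->date = p_in->date + i_frame_duration / 2;

    /* Chain the outputs into a linked list. */
    p_first->p_next = p_second;
    p_second->p_next = NULL;
    return p_first;
}
```

For example, with an input PTS of 1000000 and a frame duration of 40000 (25 fps in microseconds), the two outputs get the timestamps 1000000 and 1020000.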
Frame allocation
Be sure to allocate your output pictures using
picture_t *p_outpic = filter_NewPicture( p_filter );
This function requests a new picture from the private pool (see src/video_output/vout_wrapper.c). This is very important! Be aware that pictures created like
picture_t *p_temp = picture_NewFromFormat( &p_input_pic->format );
lack a shared memory context (on Linux, or the equivalent on other operating systems), and thus cannot be passed to the video output logic. If your filter attempts to do so, VLC will crash.
If you need temporary pictures, you can allocate those using picture_NewFromFormat(), but remember to always create output pictures using filter_NewPicture(). Here an output picture is defined as a picture which goes out from the filter to the caller.
There is a limit to the number of simultaneous picture slots available in the private pool. Currently ([71992c5fbc75ee2c94cd2a3ab1aaca70bfc688a9]) this is 3 pictures for the filter layer. See constant private_picture in src/video_output/vout_wrapper.c. If you try to allocate more output pictures than this (during a single processing call), the allocation will fail.
Notes
- In VLC 1.2, the video format (resolution, chroma) never changes on the fly. The whole filter chain is Close()d and then Open()ed again if the format changes.
- Note that it is allowed to flush the filter without closing it. This actually happens under some circumstances, so don't deallocate stuff in your flush function. Rather, allocate dynamic resources in Open(), and deallocate them in Close() (or in private functions called from those).
- Supported input and output formats are left up to each filter. The usual thing is to support YUV formats, e.g. I420, J420, YV12, I422 and J422. See include/vlc_fourcc.h. The most important ones to support are I420 and I422.
Pitch, visible pitch, planes et al.
Pitch and visible pitch
Important to know:
- i_pitch = length of one line in memory, in bytes, including any margin
- i_visible_pitch = the part of each line that can at most be displayed; usually smaller than i_pitch because of memory alignment constraints etc. (see below)
Note that the pitches may be different for each plane, and indeed in YUV formats the luma (Y_PLANE) and chroma (U_PLANE, V_PLANE) have different pitches due to chroma subsampling. (In 4:2:0 formats, the number of lines differs, too.)
Pitches also tend to slightly differ depending on how the picture was allocated, even if the visible size is the same. The input picture to the filter may have one pitch, temporary pictures (picture_NewFromFormat()) another, and output pictures (filter_NewPicture()) yet another.
Be sure to always use the correct pitch when you handle the pixels of a picture. That is, always use the i_pitch member of the actual plane_t you are working on.
If you absolutely need matching pitches (e.g. if you are gluing in processing code from another GPL-compatible project which assumes this and you don't want to rewrite it...), consider making temporary copies with picture_NewFromFormat(). See modules/video_filter/deinterlace/deinterlace.c (the history mechanism pp_history[]) and modules/video_filter/deinterlace/algo_yadif.c (its usage) for an example.
If you are processing pixels from one picture to another, the safe thing to do is to take the smallest i_visible_pitch, and loop from x = 0 until the visible pitch has been reached, but use the individual i_pitches for computing the pixel locations. See modules/video_filter/deinterlace/helpers.c for examples.
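The loop described above could look roughly like this. The demo_plane_t struct is a stand-in carrying only the plane members used here, the inversion is just a placeholder for real processing, and 8-bit planar samples are assumed (one byte per sample, so x can index bytes directly). See helpers.c for the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for plane_t with only the members used in this sketch. */
typedef struct demo_plane_t {
    uint8_t *p_pixels;
    int i_pitch;         /* full line length in memory */
    int i_visible_pitch; /* displayable part of each line */
    int i_visible_lines; /* displayable number of lines */
} demo_plane_t;

static int demo_min(int a, int b) { return a < b ? a : b; }

/* Process one plane from p_src into p_dst (here: invert the samples). */
static void demo_invert_plane(demo_plane_t *p_dst, const demo_plane_t *p_src)
{
    /* Loop bounds: the smallest visible area of the two planes. */
    const int i_width = demo_min(p_dst->i_visible_pitch,
                                 p_src->i_visible_pitch);
    const int i_lines = demo_min(p_dst->i_visible_lines,
                                 p_src->i_visible_lines);

    assert(i_width <= p_dst->i_pitch && i_width <= p_src->i_pitch);

    for (int y = 0; y < i_lines; y++) {
        /* Line start: each plane uses its OWN i_pitch. */
        const uint8_t *p_in = p_src->p_pixels + (size_t)y * p_src->i_pitch;
        uint8_t *p_out = p_dst->p_pixels + (size_t)y * p_dst->i_pitch;

        for (int x = 0; x < i_width; x++)
            p_out[x] = 255 - p_in[x];
    }
}
```

Bytes beyond the visible pitch in the destination are left untouched, so a source with pitch 4 and a destination with pitch 8 work together without any special casing.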
video_format_t vs. plane_t
In video_format_t, the i_visible_width and i_visible_height members go with i_x_offset and i_y_offset, but they are not related to plane_t::i_visible_*.
The plane_t::i_visible_* fields are there to know which part of the memory planes can at most be displayed. Usually these are different from the whole surface due to memory alignment constraints (like mod 16 for SSE2 or to be aligned on a macroblock). Those values are decided when the picture is allocated and they usually don't change for a given picture pointer (but they can change, like in direct3d buffers).
The video_format_t::i_visible_width/height and i_x/y_offset are more a 'hint' regarding what will be actually displayed. A filter should update filter_t::fmt_out if needed to ensure that the allocated pictures get the right value. A filter should, however, not limit its processing to this area, because VLC supports dynamic cropping.
Thus, the correct thing to do in a filter is to process the whole picture area, or at least the first i_visible_* lines and pixels according to the plane_ts. Filters do not need to care about margin positioning or that there even exists such a thing as margins.
Disclaimer
The information is mostly based on a few months' hacking on the deinterlacer.
This page is still a stub. Please add new relevant stuff.
Also, the stuff that already exists could use better organization and formatting.