A DirectShow Tutorial

Displaying an image sequence

In order to process image sequences (from files or from a camera), you have to use DirectShow. DirectShow, which is part of Microsoft DirectX, relies on a filter architecture. There are three types of filters: source filters that output video and/or audio signals, transform filters that process an input signal and produce one (or several) outputs, and finally rendering filters that display or save a media signal. The processing of a sequence is therefore done using a series of filters connected together, the output of one filter becoming the input of the next one (a filter can also have multiple outputs). The first filter is usually a source filter that reads a file stream, and the last filter could be a renderer that displays the sequence in a window. In DirectShow terminology, a series of filters is called a filter graph.

We will first try to process an AVI sequence. Let's first check that DirectX is working properly. To do so, just use the GraphEdit application. This is a very useful application included in the DirectX SDK that makes it easy to build filter graphs. It can be started from the Start|Programs|Microsoft DirectX 8.1 SDK|DirectX Utilities menu. The GraphEdit application window will pop up.

Our objective is now to visualize the building blocks required to obtain an AVI renderer. Select Graph|Insert Filters... A window will display the list of available filters. Open the DirectShow Filters tree and select the File Source (Async.) filter.

You will be asked to select an AVI file, and the filter will then appear in the GraphEdit window in the form of a box. Right-click on its output pin and select the Render Pin option. This is an intelligent option that determines which filters are required to render the selected source file and automatically connects them together, as shown here:

For an AVI sequence, the rendering part of the graph should be composed of three filters. The first one is the splitter that separates the video and audio components; this filter normally has two outputs (video and audio), but note that in the case of the selected sequence, no audio component was available. The second one is the appropriate decompressor that decodes the video sequence. Finally, the third filter is the renderer itself, which creates the window and displays the frame sequence in it. Just push the play button to execute the graph, and the selected AVI sequence should be displayed in a window. We can build the same filter graph using Visual C++. You first need to add the following include path to your project settings:
C:\DXSDK\samples\Multimedia\DirectShow\BaseClasses
And the following library path:
C:\DXSDK\lib 
Finally add the following library:
STRMBASE.LIB
DirectX is implemented using Microsoft's COM technology. This means that whatever you want to do, you do it through a given COM interface. In order to initialize the COM layer, you must call:
      CoInitialize(NULL);
And similarly, when you are done with COM, you need to uninitialize it:
      CoUninitialize();
A COM interface is an abstract class containing pure virtual functions (which together form the interface). Using a COM interface is the only way to communicate with a COM object. Interfaces are obtained by calling the appropriate API functions. These functions return a value of type HRESULT representing an error code. The simplest way to verify whether a COM call failed or succeeded is to check the return value using the FAILED macro. All COM interfaces derive from the IUnknown interface.
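
For example, a COM call can be checked as follows (a minimal sketch; how the error is reported is left to the application):

      HRESULT hr= CoInitialize(NULL);
      if (FAILED(hr)) {
        // COM could not be initialized; report the error and return
      }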

A very important rule when you use an interface is to never forget to release it after you have finished using it; otherwise resource leaks will result. This is done by calling the Release method of the IUnknown interface, which decrements the object's reference count by 1; when the count reaches 0, the object is deallocated. The safest way to call the Release method is to use the SAFE_RELEASE macro that can be found in dxutil.h, located in C:\DXSDK\samples\Multimedia\Common\include. This macro is simply defined as:

#define SAFE_RELEASE(p) { if(p){(p)->Release();(p)=NULL;}}
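As a usage sketch (pMediaControl here is just a hypothetical interface pointer; this interface is introduced later in this tutorial), the macro is called as soon as the interface is no longer needed:

IMediaControl *pMediaControl= NULL;
// ... obtain the interface and use it ...
SAFE_RELEASE(pMediaControl); // decrements the reference count
                             // and resets the pointer to NULL
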
To use a DirectX component, you must first obtain its top-level interface. Each component is identified by a CLSID and each of its interfaces is identified by an IID. For example, to create a DirectShow filter graph (used to build a series of filters) you call:
IGraphBuilder *pGraph;
CoCreateInstance(CLSID_FilterGraph, // object identifier 
                 NULL, CLSCTX_INPROC, 
                 IID_IGraphBuilder, // interface identifier 
                 (void **)&pGraph); // pointer to the 
                                    // top-level interface
To request the other interfaces of this object, you use the QueryInterface method. For example:

IMediaControl *pMediaControl;
pGraph->QueryInterface(
    IID_IMediaControl,        // interface identifier
    (void **)&pMediaControl); // pointer to the interface
Once the filter graph is created, it becomes easy to create all the filters required to render an AVI file. This is done by calling:
pGraph->RenderFile(MediaFile, NULL);
This call does what the Render Pin option does in the GraphEdit application. To play the video, two more interfaces are required: IMediaControl, which is used to start the playback, and IMediaEvent, which is used to detect when the rendering of the stream has completed. Here is the complete class:
class SequenceProcessor {
 
  IplImage* img; // Declare IPL/OpenCV image pointer
  IGraphBuilder *pGraph;
  IMediaControl *pMediaControl;
  IMediaEvent *pEvent;

  public:
    
   SequenceProcessor(CString filename, bool display=true) {
    
      CoInitialize(NULL);

      pGraph= 0;
      pMediaControl= 0;
      pEvent= 0;

      // Create the filter graph
      if (!FAILED(  
            CoCreateInstance(CLSID_FilterGraph, 
                             NULL, CLSCTX_INPROC, 
                             IID_IGraphBuilder, 
                             (void **)&pGraph))) {

        // The two control interfaces
        pGraph->QueryInterface(IID_IMediaControl, 
                               (void **)&pMediaControl);
        pGraph->QueryInterface(IID_IMediaEvent, 
                               (void **)&pEvent);

        // Convert the CString into a WCHAR string
        WCHAR *MediaFile=
                  new WCHAR[filename.GetLength()+1];
        MultiByteToWideChar(CP_ACP, 0, 
                            filename, -1, MediaFile,                 
                            filename.GetLength()+1);

        // Create the filters
        pGraph->RenderFile(MediaFile, NULL);
        delete [] MediaFile; // the converted string is no longer needed

        if (display) {

          // Run the filter graph
          pMediaControl->Run();

          // Wait for completion. 
          long evCode;
          pEvent->WaitForCompletion(INFINITE, &evCode);
        }
      }
    }

    ~SequenceProcessor() {

      // Do not forget to release after use
      SAFE_RELEASE(pMediaControl);
      SAFE_RELEASE(pEvent);
      SAFE_RELEASE(pGraph);

      CoUninitialize();
    }
};
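Before integrating this class into the application, here is a minimal usage sketch (the file path is just an example):

// Build the graph for the given AVI file, play it,
// and wait until rendering completes
SequenceProcessor player("c:\\videos\\sample.avi");
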
When an AVI file is selected, the corresponding filter graph is created and the sequence is displayed. To get an idea of which filters have been created, we can enumerate them by adding the following member function to our class:
std::vector<CString> enumFilters() {

  IEnumFilters *pEnum = NULL;
  IBaseFilter *pFilter;
  ULONG cFetched;
  std::vector<CString> names;

  if (pGraph == 0)  // the graph may have failed to be created
    return names;

  pGraph->EnumFilters(&pEnum);
 
  while(pEnum->Next(1, &pFilter, &cFetched) == S_OK)
  {
    FILTER_INFO FilterInfo;
    char szName[256];
    CString fname;

    pFilter->QueryFilterInfo(&FilterInfo);
    WideCharToMultiByte(CP_ACP, 0, FilterInfo.achName, 
                        -1, szName, 256, 0, 0);
    fname= szName;
    names.push_back(fname);

    SAFE_RELEASE(FilterInfo.pGraph);
    SAFE_RELEASE(pFilter);
  }

  SAFE_RELEASE(pEnum);

  return names;
}
This method simply creates a vector of strings (you have to include <vector>) containing the names of the filters associated with the generated filter graph. These names are obtained by reading the FILTER_INFO structure. The enumeration is obtained by calling the EnumFilters method of the filter graph instance. Note how all interfaces are released, including the one indirectly obtained through FILTER_INFO, which also contains a pointer to the associated filter graph.
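
As a quick check, one could dump these names to the debug output (a minimal sketch; TRACE is the MFC debug macro and procseq is assumed to point to a SequenceProcessor built from an AVI file):

std::vector<CString> names= procseq->enumFilters();
for (unsigned int i=0; i<names.size(); i++)
  TRACE("filter %u: %s\n", i, (LPCSTR)names[i]);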

To display the filter names, we add a CListBox to the dialog. Do not forget to add a control member variable for this list. This can be done using the Class Wizard of the View menu. Select the Member Variables tab and then select the control ID that corresponds to the CListBox (the name should be IDC_LIST1). Click on the Add Variable... button and call the variable m_list; its Category must be Control. The m_list variable is now available as a member variable of the dialog class. The filter names are added to this list box by changing the OnOpen method as follows:

void CCvisionDlg::OnOpen() 
{
  CFileDialog dlg(TRUE, _T("*.bmp"), "",
   OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
   "image files (*.bmp;*.jpg)|*.bmp;*.jpg|"
   "AVI files (*.avi)|*.avi|"
   "All Files (*.*)|*.*||", NULL);

  char title[]= {"Open Image"};
  dlg.m_ofn.lpstrTitle= title;

  if (dlg.DoModal() == IDOK) {

    CString path= dlg.GetPathName();
    CString ext= dlg.GetFileExt();

    if (proc != 0) {
      delete proc;
      proc= 0;      // avoid a dangling pointer on the next call
    }

    if (procseq != 0) {
      delete procseq;
      procseq= 0;
    }

    if (ext.CompareNoCase("avi") != 0) { // not an AVI: open as an image

      proc= new ImageProcessor(path);

    } else {

      procseq= new SequenceProcessor(path);

      // Obtaining the list of filters
      std::vector<CString> names= procseq->enumFilters();

      m_list.ResetContent();
      for (unsigned int i=0; i<names.size(); i++)
        m_list.AddString(names[i]);
    }
  }
}
If you now open an AVI file, you can see the filter list:

Check point #1: source code of the above example.


(c) Robert Laganiere 2011