SAMI Is My Hero: MS08-033 Disassembled

My name is Bow Sineath and I have recently joined the SecureWorks Counter Threat Unit (CTU) as a security researcher. During my previous employment, I managed an IDS/IPS signature set and was responsible for acting on vulnerability intelligence that was, more often than not, very limited in public details. My experience in reverse engineering, source code analysis and countermeasure development is assisting SecureWorks in developing countermeasures that accurately protect our clients.

For my initial blog post as a CTU researcher, I am going to detail a vulnerability from Microsoft's June 2008 patch cycle and correct an error I have seen in a number of the publicly available countermeasures. The vulnerability exists in the way that Microsoft DirectX 7 and 8 parse Synchronized Accessible Media Interchange (SAMI) files. Because DirectX 7 and 8 are only distributed with Windows 2000, other modern variants of Windows are not vulnerable to this specific issue, which limits the scope of this vulnerability significantly. The Microsoft advisory identifier for this vulnerability is MS08-033.

The SAMI file format is a captioning technology developed by Microsoft and detailed here (http://msdn.microsoft.com/en-us/library/ms971327.aspx). The format uses tags and markup similar to HTML and CSS to describe and display captions. SAMI files can be identified by their plain-text contents and, in most cases, by the .smi file extension. When SAMI is used, the SAMI file describing the captions exists as a separate entity from the media, which means there will be two files: the media itself and the SAMI file providing the captions. Further details on the file format are available from Microsoft.

The vulnerability we will be analyzing exists in the way that DirectX 7 and 8 prior to MS08-033 handle class declarations in SAMI files. The SAMI file format requires that two headers exist, the SAMIParam block and the Style block, both of which provide metadata for the SAMI file and the captions. Within the Style block, classes can be declared and defined so that each caption can be associated with language-specific settings. A class can have a name, a language and formatting information associated with it. A class is declared with a period (.) followed by a sequence of alphanumeric characters, which is the name used to associate a caption with the class. Following the class name, an opening curly bracket ({) designates the start of the class definition. Within the definition lies the meat of the class, including a class name to be presented to the user and a language selection, which by definition should follow the ISO639-ISO3166 naming convention. In addition, the author can specify formatting options within the class definition. The final portion of the definition is a closing curly bracket (}), designating the end of the definition. An example declaration and definition could look like this:

.F00  { Name:"Foo Class"; Lang: en-US; color: white; }
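To make the structure of a declaration concrete, here is a minimal Python sketch that splits the example above into a class name and its properties. The helper name and the regular expression are my own illustration, not part of any SAMI library, and the pattern only covers the simple form described in this post.

```python
import re

# Hypothetical pattern: a period, the class name, then a body in curly brackets.
CLASS_RULE = re.compile(r"\.(\w+)\s*\{([^}]*)\}")

def parse_class_rule(text):
    """Return (class_name, properties) for the first class rule in text, or None."""
    match = CLASS_RULE.search(text)
    if match is None:
        return None
    name, body = match.groups()
    properties = {}
    for item in body.split(";"):
        if ":" not in item:
            continue  # skip empty fragments such as the one after the last ';'
        key, value = item.split(":", 1)
        properties[key.strip().lower()] = value.strip().strip('"')
    return name, properties

rule = parse_class_rule('.F00  { Name:"Foo Class"; Lang: en-US; color: white; }')
```

Note that nothing in this grammar limits how long a property value may be, which is worth keeping in mind for the analysis that follows.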

Once the two header blocks have been defined, the captions can be defined. A basic SAMI caption consists of two tags, the first being SYNC, which defines the time in milliseconds to display the caption and the second being P, which allows the author to specify a class and ID for formatting. A typical caption definition could look like this:

    <SYNC  Start=10>
    <P  Class=F00> Hello there
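As a rough illustration of how the two tags pair up, the following Python sketch extracts the start time, class name and text from captions of the form shown above. The pattern is an assumption that only handles this simple layout (one SYNC tag immediately followed by one P tag with a Class attribute); real SAMI files allow more variation.

```python
import re

# Hypothetical pattern: a SYNC tag, then a P tag with a Class, then the caption text.
CAPTION = re.compile(
    r"<SYNC\s+Start=(\d+)>\s*<P\s+Class=(\w+)>\s*([^<]*)",
    re.IGNORECASE,
)

def parse_captions(body):
    """Return a list of (start_ms, class_name, text) tuples."""
    return [
        (int(start), cls, text.strip())
        for start, cls, text in CAPTION.findall(body)
    ]

sample = "<SYNC  Start=10>\n<P  Class=F00> Hello there\n"
captions = parse_captions(sample)
```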

In addition to class definitions, a caption can be formatted within the Style block by using either a Paragraph block or a Source block. Both of these formatting types are documented in the Microsoft SAMI specification and work somewhat similarly to classes.

The vulnerability lies in the way that quartz.dll applies style information to captions and exists in the CSAMIRead::FillBuffer() function. Within CSAMIRead::FillBuffer(), there are multiple calls to lstrcpyn and wsprintf, both of which are easy to misuse. The use of these functions, in combination with a failure to properly track the size of data copied into a buffer, results in a trivially exploitable stack-based buffer overflow.

At the start of CSAMIRead::FillBuffer(), two stubs are called which essentially obtain values from a structure and return them. Both of the function calls are virtual, so the target of the calls is not immediately clear.

There are several ways the target of virtual function calls can be determined. In this case the quickest and easiest way is to set a breakpoint on CSAMIRead::FillBuffer() and step through it in a debugger. This can be done by attaching a debugger to Windows Media Player, setting the debugger to break on library load, and setting breakpoints within quartz.dll after the library has been loaded. The two functions are CMediaSample::GetSize() and CMediaSample::GetPointer(), both virtual member functions of the IMediaSample COM interface. The block of code which makes these calls is below.

  .text:35584819                 mov     edi, [ebp+arg_0]
  .text:3558481C                 mov     esi, ecx
  .text:3558481E                 push    edi
  .text:3558481F                 mov     eax, [edi]
  .text:35584821                 call    dword ptr [eax+10h] ; CMediaSample::GetSize()
  .text:35584824                 mov     eax, [edi]
  .text:35584826                 lea     ecx, [ebp+var_8]
  .text:35584829                 push    ecx
  .text:3558482A                 push    edi
  .text:3558482B                 call    dword ptr [eax+0Ch] ; CMediaSample::GetPointer()
  .text:3558482E                 test    eax, eax
  .text:35584830                 jge     short loc_35584842 ; if (GetPointer() < 0) return error
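The debugger result can also be cross-checked statically. The Python sketch below computes vtable offsets under two assumptions: the object really is an IMediaSample, and its vtable follows the interface declaration order (the three IUnknown methods first, then GetPointer and GetSize) with 4-byte slots on 32-bit x86. The names in the list come from that declaration; the helper itself is only an illustration.

```python
POINTER_SIZE = 4  # bytes per vtable slot on 32-bit x86

# Assumed method order: IUnknown methods, then the first two IMediaSample methods.
VTABLE_ORDER = ["QueryInterface", "AddRef", "Release", "GetPointer", "GetSize"]

def vtable_offset(method):
    """Byte offset of a method's slot from the start of the vtable."""
    return VTABLE_ORDER.index(method) * POINTER_SIZE

offsets = {name: hex(vtable_offset(name)) for name in ("GetPointer", "GetSize")}
```

Under those assumptions the computed offsets line up with the [eax+0Ch] and [eax+10h] call targets in the listing.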

It is important to note in this block of code that the return value of the call to CMediaSample::GetSize() is discarded, which makes calling this function useless since the only purpose it serves is to return the size of the buffer. Moving a few blocks down into the function, checks are performed to see if certain fields were initialized in the SAMI file. This block of code is below.

  .text:3558486A                 mov     ecx, [esi+494h] ; Pointer to SOURCE= object
  .text:35584870                 test    ecx, ecx
  .text:35584872                 jz      short loc_3558487D
  .text:35584874                 mov     edi, ecx
  .text:35584876                 mov     ecx, offset Default ; ""
  .text:3558487B                 jmp     short loc_35584884
  .text:3558487D ; ---------------------------------------------------------------------------
  .text:3558487D
  .text:3558487D loc_3558487D:   ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+61j
  .text:3558487D                 mov     ecx, offset Default ; ""
  .text:35584882                 mov     edi, ecx
  .text:35584884
  .text:35584884 loc_35584884:   ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+6Aj
  .text:35584884                 mov     eax, [eax+4]    ; Pointer to class definition
  .text:35584887                 test    eax, eax
  .text:35584889                 mov     edx, eax
  .text:3558488B                 jnz     short loc_3558488F
  .text:3558488D                 mov     edx, ecx
  .text:3558488F
  .text:3558488F loc_3558488F:   ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+7Aj
  .text:3558488F                 mov     eax, [esi+498h] ; Pointer to Paragraph style
  .text:35584895                 test    eax, eax
  .text:35584897                 jz      short loc_3558489B
  .text:35584899                 mov     ecx, eax

The three pointers shown here are pushed as arguments to a wsprintf() call, so each must hold a valid value before the call. The block of code above checks each pointer to see whether it is NULL. If it is, a pointer to an empty string is placed in the register used for the wsprintf() argument; otherwise the pointer to the object is used. This seems a bit complicated, but seeing the wsprintf() call will make it clear.
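Stripped of register shuffling, the selection logic amounts to a NULL-to-empty-string fallback applied three times. A minimal Python model, with None standing in for a NULL pointer and hypothetical function names of my own choosing:

```python
DEFAULT = ""  # stands in for the empty "Default" string in the listing

def pick(pointer):
    """Return the empty default for a NULL (None) pointer, else the string itself."""
    return DEFAULT if pointer is None else pointer

def select_arguments(source, class_definition, paragraph_style):
    """Apply the fallback to each of the three style strings."""
    return pick(source), pick(class_definition), pick(paragraph_style)

selected = select_arguments(None, "color: white;", None)
```

The point to notice is that the only check performed is for NULL; nothing here inspects the length of a string that is present.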

  .text:3558489B loc_3558489B:   ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+86j
  .text:3558489B                 push    edi
  .text:3558489C                 push    edx
  .text:3558489D                 push    ecx
  .text:3558489E                 push    offset aPStyleHsHsHs ; "<P  STYLE="%hs %hs %hs">"
  .text:355848A3                 push    [ebp+var_8]     ; LPSTR
  .text:355848A6                 call    ebx ; __imp__wsprintfA
  .text:355848A8                 mov     edi, eax
  .text:355848AA                 mov     eax, [esi+49Ch]
  .text:355848B0                 add     esp, 14h
  .text:355848B3                 xor     ecx, ecx
  .text:355848B5                 mov     eax, [eax+20h]
  .text:355848B8                 mov     [ebp+var_C], ecx
  .text:355848BB                 test    eax, eax
  .text:355848BD                 mov     [ebp+var_4], eax
  .text:355848C0                 jz      short loc_355848E7

Because wsprintf() is a variadic function, it uses the cdecl calling convention: arguments are pushed onto the stack in reverse order and the caller cleans up the stack afterwards (the add esp, 14h above). In the call above, edi points to the source object of the caption, edx points to the class definition, and ecx points to the Paragraph style block. In effect, this block of code substitutes the values specified in the SAMI source, class and style objects into each caption. Because wsprintf() takes no destination buffer size and performs no bounds checking against the buffer it is given, any of the three strings passed into this wsprintf() call can be used to trigger the vulnerability. The code that follows contains a series of lstrcpyn() and wsprintf() calls, all of which can be used to trigger the vulnerability as well.
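The effect of the unchecked formatting call can be modeled in a few lines of Python. This is only a sketch: the 64-byte buffer size used in the example is a made-up number for illustration, since the real destination size is not given here, and Python's %s stands in for the %hs specifiers in the listing.

```python
FORMAT = '<P  STYLE="%s %s %s">'  # same layout as the format string in the listing

def formatted_length(paragraph_style, class_definition, source):
    """Length of the expanded string, with arguments in ecx, edx, edi order."""
    return len(FORMAT % (paragraph_style, class_definition, source))

def overflows(buffer_size, paragraph_style, class_definition, source):
    """True if the expanded string plus its terminating NUL exceeds buffer_size."""
    needed = formatted_length(paragraph_style, class_definition, source) + 1
    return needed > buffer_size

HYPOTHETICAL_BUFFER = 64  # illustrative size only, not taken from quartz.dll

benign = overflows(HYPOTHETICAL_BUFFER, "", "color: white;", "")
via_class = overflows(HYPOTHETICAL_BUFFER, "", "A" * 100, "")
via_source = overflows(HYPOTHETICAL_BUFFER, "", "", "A" * 100)
via_paragraph = overflows(HYPOTHETICAL_BUFFER, "A" * 100, "", "")
```

Whichever of the three arguments carries the long string, the expanded output exceeds the buffer, which is why a countermeasure that inspects only one of them is incomplete.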

Most countermeasures for this vulnerability do not provide complete protection. They account only for certain portions of class definitions, completely ignoring the Source and Paragraph blocks, and miss the fact that a class definition does not need to contain valid identifiers for the vulnerability to be triggered. This vulnerability is one of many that underscore the importance of reverse engineering patches and creating internal proofs of concept when developing countermeasures.
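As a rough illustration of a broader check, the Python sketch below flags any rule in the Style block whose body is unusually long, whether it is a class, a Source rule or a Paragraph rule, and without requiring any particular identifier inside the body. The regular expressions, the function name and the 256-character threshold are all assumptions for the sake of the example; this is not a production signature and the threshold is not derived from the patch.

```python
import re

STYLE_BLOCK = re.compile(r"<STYLE[^>]*>(.*?)</STYLE>", re.IGNORECASE | re.DOTALL)
# Selector (P, #Source, .ClassName, ...) followed by a body; the closing
# bracket is deliberately not required, so unterminated rules are still seen.
RULE = re.compile(r"([.#]?\w+)\s*\{([^}]*)")
MAX_RULE_BODY = 256  # hypothetical threshold chosen for illustration

def suspicious_rules(sami_text, limit=MAX_RULE_BODY):
    """Return the selectors of Style block rules whose body exceeds limit."""
    hits = []
    for block in STYLE_BLOCK.findall(sami_text):
        for selector, body in RULE.findall(block):
            if len(body) > limit:
                hits.append(selector)
    return hits

sample = (
    '<SAMI><HEAD><STYLE TYPE="text/css"><!--\n'
    "P { color: white; }\n"
    "#Source { " + "A" * 500 + " }\n"
    ".F00 { Name: Foo; }\n"
    "--></STYLE></HEAD></SAMI>"
)
flagged = suspicious_rules(sample)
```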
