Storage for decoding and storing strings associated with an address.  
#include <stringmanage.hh>
static bool hasCharTerminator (const uint1 *buffer, int4 size, int4 charsize)
    Check for a unicode string terminator.
static int4 readUtf16 (const uint1 *buf, bool bigend)
    Read a UTF16 code point from a byte array.
static void writeUtf8 (ostream &s, int4 codepoint)
    Write unicode character to stream in UTF8 encoding.
static int4 getCodepoint (const uint1 *buf, int4 charsize, bool bigend, int4 &skip)
    Extract next unicode codepoint.

Storage for decoding and storing strings associated with an address. 
Looks at data in the loadimage to determine whether it represents a "string". Decodes the string for presentation in the output. Stores the decoded string until it is needed for presentation. 
 
◆ StringManager()
StringManager::StringManager(int4 max)
 
Constructor. 
- Parameters
 - 
  
    | max | is the maximum number of characters to allow before truncating string  | 
  
   
 
 
◆ getCodepoint()
static int4 StringManager::getCodepoint(const uint1 *buf, int4 charsize, bool bigend, int4 &skip)
 
Extract next unicode codepoint. 
One or more bytes are consumed from the array, and the number of bytes consumed is passed back through skip. 
- Parameters
 - 
  
    | buf | is a pointer to the bytes in the character array  | 
    | charsize | is 1 for UTF8, 2 for UTF16, or 4 for UTF32  | 
    | bigend | is true for big endian encoding of the UTF element  | 
    | skip | is a reference for passing back the number of bytes consumed  | 
  
   
- Returns
 - the codepoint or -1 if the encoding is invalid 
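
The decoding described above can be sketched as a standalone function. This is a minimal illustration, not the library's implementation: the names nextCodepoint and read16 are hypothetical, standard fixed-width types stand in for int4 and uint1, and the caller is assumed to guarantee that enough bytes are readable at buf. The sketch does not reject overlong UTF8 forms, and it writes skip only on success.

```cpp
#include <cstdint>

// Hypothetical helper: combine two bytes in the requested byte order.
static int32_t read16(const uint8_t *buf, bool bigend) {
  int32_t hi = bigend ? buf[0] : buf[1];
  int32_t lo = bigend ? buf[1] : buf[0];
  return (hi << 8) | lo;
}

// Sketch with the same parameter shape as getCodepoint(); the name is hypothetical.
// Assumes enough bytes are readable at buf. Overlong UTF8 forms are not rejected.
int32_t nextCodepoint(const uint8_t *buf, int charsize, bool bigend, int &skip) {
  int32_t cp = 0;
  int used = 0;
  if (charsize == 1) {                          // UTF8: lead byte gives the length
    int32_t b0 = buf[0];
    if ((b0 & 0x80) == 0x00)      { cp = b0;        used = 1; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; used = 2; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; used = 3; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; used = 4; }
    else return -1;                             // continuation or invalid lead byte
    for (int i = 1; i < used; ++i) {
      if ((buf[i] & 0xC0) != 0x80) return -1;   // expected a continuation byte
      cp = (cp << 6) | (buf[i] & 0x3F);
    }
  } else if (charsize == 2) {                   // UTF16: may need a surrogate pair
    cp = read16(buf, bigend);
    used = 2;
    if (cp >= 0xD800 && cp <= 0xDBFF) {         // high surrogate, read the low one
      int32_t low = read16(buf + 2, bigend);
      if (low < 0xDC00 || low > 0xDFFF) return -1;
      cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
      used = 4;
    }
  } else if (charsize == 4) {                   // UTF32: one fixed-width element
    uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
      v = (v << 8) | buf[bigend ? i : 3 - i];
    if (v > 0x10FFFF) return -1;
    cp = static_cast<int32_t>(v);
    used = 4;
  } else {
    return -1;                                  // unsupported element size
  }
  // Reject values outside the Unicode range and unpaired surrogates.
  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return -1;
  skip = used;
  return cp;
}
```

A caller would typically loop, advancing its buffer pointer by skip after each successful call and stopping on -1 or on a terminator.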
 
 
 
◆ getStringData()
virtual const vector<uint1>& StringManager::getStringData(const Address &addr, Datatype *charType, bool &isTrunc) = 0
 
Retrieve string data at the given address as a UTF8 byte array. 
If the address does not represent string data, a zero length vector is returned. Otherwise, the string data is fetched, converted to a UTF8 encoding, cached and returned. 
- Parameters
 - 
  
    | addr | is the given address  | 
    | charType | is a character data-type indicating the encoding  | 
    | isTrunc | passes back whether the string is truncated  | 
  
   
- Returns
 - the byte array of UTF8 data 
 
Implemented in StringManagerUnicode, and GhidraStringManager.
 
 
◆ hasCharTerminator()
static bool StringManager::hasCharTerminator(const uint1 *buffer, int4 size, int4 charsize)
 
Check for a unicode string terminator. 
- Parameters
 - 
  
    | buffer | is the byte buffer  | 
    | size | is the number of bytes in the buffer  | 
    | charsize | is the presumed size (in bytes) of character elements  | 
  
   
- Returns
 - true if a string terminator is found 
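
One plausible way to perform this check is to walk the buffer one element at a time and look for an element whose bytes are all zero. The sketch below rests on that assumption. The name containsTerminator is hypothetical, and ignoring a trailing partial element is a choice made here, not documented behavior.

```cpp
#include <cstdint>

// Illustrative stand-in for hasCharTerminator(); the name is hypothetical.
// Assumption: a terminator is one element of charsize bytes that are all zero.
bool containsTerminator(const uint8_t *buffer, int size, int charsize) {
  if (charsize <= 0) return false;
  // Step through whole elements only; a trailing partial element is ignored.
  for (int i = 0; i + charsize <= size; i += charsize) {
    bool allZero = true;
    for (int j = 0; j < charsize; ++j) {
      if (buffer[i + j] != 0) {   // any non-zero byte rules out a terminator
        allZero = false;
        break;
      }
    }
    if (allZero) return true;
  }
  return false;
}
```

Stepping by charsize matters: in UTF16 text, a single zero byte inside an element such as 0x0068 is not a terminator.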
 
 
 
◆ isString()
Returns true if the data is some kind of complete string. A given character data-type can be used as a hint for the encoding. The string decoding can be cached internally. 
- Parameters
 - 
  
    | addr | is the given address  | 
    | charType | is the given character data-type  | 
  
   
- Returns
 - true if the address represents string data 
 
 
 
◆ readUtf16()
inline static int4 StringManager::readUtf16(const uint1 *buf, bool bigend)
 
Read a UTF16 code point from a byte array. 
Pull the first two bytes from the byte array and combine them in the indicated byte order. 
- Parameters
 - 
  
    | buf | is the byte array  | 
    | bigend | is true to request big endian encoding  | 
  
   
- Returns
 - the decoded UTF16 element 
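
Combining two bytes in a chosen byte order can be sketched as follows. The function name combineUtf16 is hypothetical, and standard fixed-width types stand in for int4 and uint1.

```cpp
#include <cstdint>

// Sketch of a two-byte read; the name is hypothetical.
// For big endian the first byte is the most significant one,
// for little endian the second byte is.
int32_t combineUtf16(const uint8_t *buf, bool bigend) {
  int32_t hi = bigend ? buf[0] : buf[1];
  int32_t lo = bigend ? buf[1] : buf[0];
  return (hi << 8) | lo;
}
```

The same two bytes therefore yield different values depending on bigend, which is why the caller has to know the encoding's byte order in advance.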
 
 
 
◆ restoreXml()
Restore string cache from XML. 
Read <stringmanage> tag, with <string> sub-tags. 
- Parameters
 - 
  
    | el | is the root tag element  | 
    | m | is the manager for looking up AddressSpaces  | 
  
   
 
 
◆ saveXml()
void StringManager::saveXml(ostream &s) const
 
Save cached strings to a stream as XML. 
Write <stringmanage> tag, with <string> sub-tags. 
- Parameters
 - 
  
    | s | is the stream to write to  | 
  
   
 
 
◆ writeUtf8()
static void StringManager::writeUtf8(ostream &s, int4 codepoint)
 
Write unicode character to stream in UTF8 encoding. 
Encode the given unicode codepoint as UTF8 (1, 2, 3, or 4 bytes) and write the bytes to the stream. 
- Parameters
 - 
  
    | s | is the output stream  | 
    | codepoint | is the unicode codepoint  | 
  
   
 
 
The documentation for this class was generated from the following files: