Compiler Datatype Organization

<data_organization>

Attributes and Children
<absolute_max_alignment> (Optional) Maximum alignment possible across all datatypes (0 indicates no maximum)
value
<machine_alignment> (Optional) Maximum useful alignment for the underlying architecture
value
<default_alignment> (Optional) Default alignment for any datatype that isn't a structure, union, array, or pointer and whose size isn't in the size/alignment map
value
<default_pointer_alignment> (Optional) Default alignment for a pointer that doesn't have a size
value
<pointer_size> (Optional) Size of a pointer
value
<pointer_shift> (Optional) Left-shift amount, in bits, for shifted pointer datatypes
value
<wchar_size> (Optional) Size of "wchar", the wide character datatype
value
<short_size> (Optional) Size of "short" and other short integer datatypes
value
<integer_size> (Optional) Size of "int" and other integer datatypes
value
<long_size> (Optional) Size of "long" and other long integer datatypes
value
<long_long_size> (Optional) Size of "longlong" integer datatypes
value
<float_size> (Optional) Size of "float" and other floating-point datatypes
value
<double_size> (Optional) Size of "double" and other double precision floating-point datatypes
value
<long_double_size> (Optional) Size of "longdouble" floating-point datatypes
value
<size_alignment_map> (Optional) Size to alignment map

The <data_organization> tag provides information about the sizes of core datatypes and how the compiler typically aligns datatypes. This information is required so that analysis can determine the proper in-memory layout of datatypes, such as those described by C/C++ style header files. Both sizes and alignments are specified in bytes, using the integer value attribute of the corresponding tag. An alignment value indicates that the compiler chooses a byte address that is a multiple of that value as the start of that datatype. A value of 1 indicates no alignment restriction, since every address is a multiple of 1. Most atomic datatypes get their alignment information from the <size_alignment_map>. If the size of a particular datatype isn't listed in the map, the <default_alignment> value is used.
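To make the meaning of an alignment value concrete, the following minimal Python sketch rounds a byte address up to the next multiple of an alignment. The function name align_up is hypothetical and is not part of Ghidra; it only illustrates the arithmetic implied by the description above.

```python
def align_up(address: int, alignment: int) -> int:
    """Return the smallest multiple of `alignment` that is >= `address`.

    Both values are in bytes. An alignment of 1 leaves the address unchanged.
    """
    if alignment < 1:
        raise ValueError("alignment must be at least 1")
    remainder = address % alignment
    if remainder == 0:
        return address
    return address + (alignment - remainder)


# A 4-byte-aligned datatype requested at offset 1 starts at offset 4.
print(align_up(1, 4))  # 4
# An address that is already a multiple of the alignment is kept.
print(align_up(8, 4))  # 8
# Alignment 1 places no restriction on the address.
print(align_up(5, 1))  # 5
```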

<size_alignment_map>

Attributes and Children
<entry> (0 or more) Alignment information for a particular size
size Size of datatype in bytes
alignment The alignment value

Each <entry> maps a specific size to a specific alignment. Ghidra satisfies requests for the alignment of all atomic datatypes (except pointers) by consulting this map. If the map doesn't contain the particular size, Ghidra falls back to the <default_alignment> subtag in the parent <data_organization> tag. It's typical to provide alignments only for sizes that are powers of 2.

Example 9.

  <data_organization>
     <absolute_max_alignment value="0" />
     <machine_alignment value="2" />
     <default_alignment value="1" />
     <default_pointer_alignment value="4" />
     <pointer_size value="4" />
     <wchar_size value="4" />
     <short_size value="2" />
     <integer_size value="4" />
     <long_size value="4" />
     <long_long_size value="8" />
     <float_size value="4" />
     <double_size value="8" />
     <long_double_size value="12" />
     <size_alignment_map>
          <entry size="1" alignment="1" />
          <entry size="2" alignment="2" />
          <entry size="4" alignment="4" />
          <entry size="8" alignment="4" />
     </size_alignment_map>
  </data_organization>
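The lookup described for <size_alignment_map> can be sketched in Python as follows. This is an illustration only, not Ghidra's implementation. The function names load_alignment_rules and atomic_alignment are hypothetical, the XML fragment is a reduced copy of Example 9, and the fallback of 1 when <default_alignment> is absent is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Reduced copy of Example 9: only the tags this sketch reads.
FRAGMENT = """
<data_organization>
  <default_alignment value="1" />
  <size_alignment_map>
    <entry size="1" alignment="1" />
    <entry size="2" alignment="2" />
    <entry size="4" alignment="4" />
    <entry size="8" alignment="4" />
  </size_alignment_map>
</data_organization>
"""


def load_alignment_rules(xml_text):
    """Parse the size-to-alignment map and the default alignment."""
    root = ET.fromstring(xml_text)
    default_tag = root.find("default_alignment")
    # Assumption: fall back to 1 if the optional tag is missing.
    default_alignment = int(default_tag.get("value")) if default_tag is not None else 1
    size_map = {}
    map_tag = root.find("size_alignment_map")
    if map_tag is not None:
        for entry in map_tag.findall("entry"):
            size_map[int(entry.get("size"))] = int(entry.get("alignment"))
    return size_map, default_alignment


def atomic_alignment(size, size_map, default_alignment):
    """Alignment of a non-pointer atomic datatype of the given size in bytes."""
    return size_map.get(size, default_alignment)


size_map, default_alignment = load_alignment_rules(FRAGMENT)
print(atomic_alignment(2, size_map, default_alignment))   # 2
print(atomic_alignment(8, size_map, default_alignment))   # 4
# Size 12 (the long double in Example 9) is not in the map,
# so the default alignment applies.
print(atomic_alignment(12, size_map, default_alignment))  # 1
```

With the values from Example 9, an 8-byte datatype is aligned on a 4-byte boundary because the map says so explicitly. The 12-byte long double has no entry and gets the <default_alignment> of 1.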

<enum>

Attributes and Children
size Default size of an enumerated datatype
signed (Optional) true or false: whether an enumeration is viewed as a signed integer

This is a deprecated tag.

<funcptr>

Attributes and Children
align Number of alignment bytes for functions

Some compilers rely on the alignment of code addresses to provide extra bits of space in function pointers, where additional internal information can be stored. On ARM chips in particular, the processor itself supports an ARM/THUMB transition bit in code addresses, which are always at least 2-byte aligned. This tag informs the decompiler of this region of encoding in function pointers so that it can filter it out, allowing it to find the correct address in various situations. The align attribute should always be a power of 2 corresponding to the number of bits a compiler might use for additional storage (for example, 2 for one bit or 4 for two bits).

Example 10.

  <funcptr align="2"/>
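The filtering that the <funcptr> description refers to can be pictured as masking off the low-order bits of a pointer value. The Python sketch below is an assumption about how such filtering could work, not the decompiler's actual code. The function names code_address and extra_bits are hypothetical.

```python
def _check_align(align: int) -> None:
    # A power of 2 has exactly one bit set.
    if align < 1 or align & (align - 1) != 0:
        raise ValueError("align must be a power of 2")


def code_address(pointer_value: int, align: int) -> int:
    """Clear the low-order bits that the alignment leaves free."""
    _check_align(align)
    return pointer_value & ~(align - 1)


def extra_bits(pointer_value: int, align: int) -> int:
    """Return the information stored in the low-order bits."""
    _check_align(align)
    return pointer_value & (align - 1)


# With align="2", one low bit is available, for example for an
# ARM/THUMB transition bit.
print(hex(code_address(0x8001, 2)))  # 0x8000
print(extra_bits(0x8001, 2))         # 1
```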