The IDA Cross Reference

Hi folks, today I would like to point out an important IDA feature that is often misinterpreted by students and newcomers: the IDA cross reference. Some of the most common questions asked while reverse engineering a binary are "where is this function called from?", "which functions access this data?" and "which functions are called from the current one?". IDA Pro (even in its free edition) answers such questions in a very elegant format. The cross-reference addresses are placed as "pseudo" comments on the right-hand side of IDA View-A (the non-graphical listing view).

The basic syntax is the following:

{CODE|DATA} XREF: [base]+[offset][up arrow|down arrow][type of ref.]

The first element states whether this is a code cross reference or a data cross reference, that is, roughly, whether the referenced item is a function or a variable. The [base] is the symbol the reference comes from, for instance _main or psum. The offset tells you where the referencing location sits inside that base function. The up/down arrow tells you in which direction to scroll to find the referencing location. Finally, each cross reference has a type that depends on how the item is referenced. For a code cross reference the type can be: ordinary flow (o) if it represents the sequential flow from one instruction to the next, jump flow (j) if it is triggered by an unconditional or conditional branch, and call flow (p) if it indicates a transfer of control to a target function. For a data cross reference the type can be read (r), write (w), or offset (o) when the address of the data is taken.
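To make the syntax concrete, here is a minimal sketch in Python that splits such a pseudo comment into its parts. It is not part of IDA or IDAPython: the names XREF_RE, Xref and parse_xref are hypothetical, and the sketch assumes the comment looks like "DATA XREF: sub_401334+2↑o", with a hexadecimal offset and Unicode arrows.

```python
import re
from dataclasses import dataclass

# Hypothetical helper, not an IDA API: it parses the text of an xref comment
# as described above. Assumed layout: "<CODE|DATA> XREF: base+offset<arrow><type>".
XREF_RE = re.compile(
    r"(?P<kind>CODE|DATA) XREF:\s*"
    r"(?P<base>[^\s+↑↓]+)"                  # base symbol, e.g. _main or sub_401334
    r"(?:\+(?P<offset>[0-9A-Fa-f]+)h?)?"    # optional hex offset inside the base
    r"(?P<arrow>[↑↓])"                      # where the referencing location lies
    r"(?P<type>[a-z])"                      # one-letter reference type
)

# The same letter can mean different things for code and data references.
CODE_TYPES = {"o": "ordinary flow", "j": "jump", "p": "call"}
DATA_TYPES = {"r": "read", "w": "write", "o": "offset (address taken)"}


@dataclass
class Xref:
    kind: str        # "CODE" or "DATA"
    base: str        # symbol the reference comes from
    offset: int      # offset of the referencing location inside the base
    direction: str   # "above" or "below" the current position in the listing
    ref_type: str    # human-readable reference type


def parse_xref(comment: str) -> Xref:
    """Parse one xref pseudo comment; raise ValueError if it does not match."""
    m = XREF_RE.search(comment)
    if m is None:
        raise ValueError(f"not an xref comment: {comment!r}")
    types = CODE_TYPES if m["kind"] == "CODE" else DATA_TYPES
    return Xref(
        kind=m["kind"],
        base=m["base"],
        offset=int(m["offset"] or "0", 16),
        direction="above" if m["arrow"] == "↑" else "below",
        ref_type=types.get(m["type"], "unknown"),
    )


if __name__ == "__main__":
    print(parse_xref("; DATA XREF: sub_401334+2↑o"))
    print(parse_xref("; CODE XREF: _main+3B↓j"))
```

Note how the letter o is looked up in a different table depending on whether the comment is a code or a data cross reference; mixing the two up is a common source of confusion.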

The following example shows that the string "Good Work!" (null-terminated) is used by sub_401334 at offset 2h. The up arrow says that the referencing function is located above the current position in the listing. Since this is a data cross reference, the trailing o denotes an offset reference (the address of the string is taken), not an ordinary flow.

Of course, IDA Pro also offers a nicer, more user-friendly interface for exploring cross references. The following buttons trigger these functions:

These buttons trigger IDA's functions that generate the tree of callers and the tree of callees, which answer the earlier questions "who calls this function?" and "which functions does this function call?". For example, the following tree shows the functions that call sub_401334, the one that uses the monitored "Good Work!" string.

From this tree we now know how to reach the monitored string. Summing up, the "Good Work!" string is referenced by sub_401334, which is called from WndProc, which is called from start. Pretty easy in this particular case, right? In reality things are often much harder than that, but the example makes the point. Applying this concept the other way around, IDA generates all the possible function calls starting from a specified function. This is another very interesting view from which the reverser can learn a lot about the behavior and structure of the analyzed binary. But maybe I'll leave this topic for another post.
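The two trees can be sketched in a few lines of Python. This is only an illustration of the idea, not how IDA builds its graphs: the CALLS dictionary is a hand-written, hypothetical call graph containing just the three functions named above, and paths_to and reachable_from are made-up helper names.

```python
from collections import defaultdict

# Hypothetical call graph (caller -> callees), written by hand from the
# chain described above; a real binary would have far more entries.
CALLS = {
    "start": ["WndProc"],
    "WndProc": ["sub_401334"],
    "sub_401334": [],
}


def invert(calls):
    """Turn a caller -> callees map into a callee -> callers map."""
    callers = defaultdict(list)
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].append(caller)
    return callers


def paths_to(target, calls):
    """Return every call chain, from a root function down to target."""
    callers = invert(calls)
    paths = []

    def walk(func, path):
        path = [func] + path
        # Skip functions already on the path so recursion cannot loop forever.
        parents = [p for p in callers.get(func, []) if p not in path]
        if not parents:
            paths.append(path)
            return
        for parent in parents:
            walk(parent, path)

    walk(target, [])
    return paths


def reachable_from(root, calls):
    """Return every function reachable from root, in depth-first order."""
    seen, stack = [], [root]
    while stack:
        func = stack.pop()
        if func in seen:
            continue
        seen.append(func)
        stack.extend(reversed(calls.get(func, [])))
    return seen


if __name__ == "__main__":
    for path in paths_to("sub_401334", CALLS):
        print(" -> ".join(path))
    print(reachable_from("start", CALLS))
```

The first helper walks the graph backwards (who calls this function?), the second walks it forwards (what does this function call?), which mirrors the two views discussed above.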