When I first sought to understand the symbol table and the global offset table (GOT), I found bits and pieces of information but had trouble getting the whole picture. Once I understood what they are, I realized they are easier to describe in the context of the linking and loading process in which they are used. That’s what this post does: it explains the why of the symbol table and the GOT to help you understand them in context.
Most of the credit goes to the authors of the posts that this one draws together. This is more a collection of pieces of information, assembled to paint a clearer picture of the whole.
The Linking Process
Object files contain references to each other’s code and data. Due to this, the linker must combine them at link time. After linking all of the object files together, the linker uses the relocation records to find all of the addresses that need to be filled in.
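To make the role of relocation records a little more concrete, here is a toy model in Python. The object-file layout (dictionaries with code, symbols and relocs keys) and the names link, main_o and util_o are made up for this sketch. Real object files encode the same ideas in binary form, and a real linker does far more than this.

```python
# Toy model of linking: each "object file" has code (a list of words),
# the symbols it defines (name -> offset in its own code), and relocation
# records (offset of a placeholder word, name of the symbol it needs).
# This layout is hypothetical and only meant to illustrate the idea.

def link(objects):
    """Concatenate object files and patch every relocated word."""
    image = []      # the combined code
    symbols = {}    # symbol name -> final address in the image
    pending = []    # (final address of placeholder, symbol name)

    # Pass 1: lay the objects out one after another and record where
    # every symbol and every placeholder ends up.
    for obj in objects:
        base = len(image)
        image.extend(obj["code"])
        for name, offset in obj["symbols"].items():
            symbols[name] = base + offset
        for offset, name in obj["relocs"]:
            pending.append((base + offset, name))

    # Pass 2: use the relocation records to fill in the addresses.
    for address, name in pending:
        if name not in symbols:
            raise KeyError(f"undefined symbol: {name}")
        image[address] = symbols[name]
    return image, symbols


main_o = {
    "code": ["call", 0, "ret"],        # word 1 is a placeholder for helper
    "symbols": {"main": 0},
    "relocs": [(1, "helper")],
}
util_o = {
    "code": ["load", 0, "ret", 42],    # word 1 is a placeholder for counter
    "symbols": {"helper": 0, "counter": 3},
    "relocs": [(1, "counter")],
}

image, symbols = link([main_o, util_o])
print(image)    # ['call', 3, 'ret', 'load', 6, 'ret', 42]
```

Note that neither object knows its final position on its own. Only after both are laid out can the placeholder words be filled in.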
The Symbol Table
Since assembling to machine code removes all traces of labels from the code, the object file format has to keep these around in a different place. It does this in the form of the symbol table, a list of names and their corresponding offsets in the text and data segments. (Source)
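As a small sketch of that idea, the toy assembler below drops labels from the emitted code and records them in a separate table. The one-entry-per-instruction encoding and the assemble function are assumptions made for illustration, not how a real assembler stores things.

```python
# Toy assembler: labels vanish from the emitted code, so their names and
# offsets are recorded separately in a symbol table. Treating every
# instruction as one list entry is a simplification for illustration.

def assemble(lines):
    code = []
    symbol_table = {}   # label name -> offset into `code`
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A label emits no code; it only names the next offset.
            symbol_table[line[:-1]] = len(code)
        else:
            code.append(line)
    return code, symbol_table


source = [
    "start:",
    "  push 1",
    "  ret",
    "square:",
    "  mul",
    "  ret",
]
code, symbol_table = assemble(source)
print(code)           # ['push 1', 'ret', 'mul', 'ret']
print(symbol_table)   # {'start': 0, 'square': 2}
```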
To recap an important concept, an executable file is made up of several object files. You might have two object files and a C library that the linker combines into one executable file at link time.
Most systems run a number of programs at any given time. If you’re familiar with programming, it probably comes as no surprise that these programs use many of the same libraries. For example, many programs use the standard C library, which exports functions like printf and malloc. If libraries were always combined into the executable, as described above, every running program would carry its own copy of the C library in memory. That is a mammoth waste of resources, so instead each program holds a reference to one common copy of the library.
Static Linking vs Dynamic Linking
In a statically linked scenario a program and the particular library it is using are combined by the linker at link time. By contrast, a dynamically linked library (in Windows a .dll file and in Linux a .so file) is linked when the executable runs.
The linker binds statically linked libraries with the program at link time (which comes directly after compilation/assembly). The largest advantage of static linking is that you can be certain which version of the library is present. This means that DLL Hell/Dependency Hell isn’t a problem for statically linked executables. It also means the executable exists as a single file rather than several files. Additionally, a statically linked executable contains only those parts of the library it needs to execute, whereas a dynamically linked program must load the entire library at runtime because it is not known in advance which functions the application will invoke.
On the downside, statically linked executables are much larger because they carry all of their library code with them. Additionally, to pick up a new version of a library you must relink (and possibly recompile) the executable.
The term ‘dynamically linked’ means that the program and the particular library it references are not combined by the linker at link time. Instead, the linker places information into the executable that tells the loader which shared object module the code is in and which runtime linker should be used to find and bind the references. (Source) This means that the runtime linker finds the shared object and binds it to the executable at runtime. This type of program is also called a partially bound executable because it isn’t fully bound at link time: the linker did not resolve all the referenced symbols, but instead recorded references to the shared object in the executable. There are four main advantages to using dynamically linked executables.
- The executable is smaller
- Libraries may be upgraded or patched without having to relink all of the executables which depend on them. In the same vein, you don’t have to distribute the source code of the libraries – you only need the compiled binary version.
- Programmers need only deliver the libraries unique to their code. They may assume that standard libraries will already be on the system.
- When combined with virtual memory, dynamic linking permits two or more processes to share read-only executable code such as the standard C library. This means only one copy of that code has to be held in physical memory rather than one for each process.
The Executable and Linkable Format (ELF) File Format
I’ll start by saying if you’re on Windows you’ll be using the PE/COFF file format. Most of the principles explained here conceptually port over to the PE/COFF format.
In order to fully understand shared objects, the symbol table and the GOT, you have to understand the ELF file format. The ELF specification defines the layout of an object file and of the executable built from it. It is how executables are standardized across systems, which for ELF typically means Linux systems. The ELF file format is fairly complicated and you can read about it in extreme detail here. In this post, I will settle for the parts relevant to the symbol table and the GOT.
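To give a feel for what the specification pins down, here is a minimal sketch that decodes the fixed-size header at the start of a 64-bit little-endian ELF file. The field order follows the ELF specification, but the function name parse_elf_header and the hand-built header bytes are inventions for this example, and 32-bit or big-endian files are deliberately not handled.

```python
import struct

# Field layout of the 64-bit little-endian ELF header, as given in the
# ELF specification. Only 64-bit little-endian files are handled here.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
ELF_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def parse_elf_header(data):
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is supported")
    return {
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "entry": e_entry,
        # (file offset, entry size, entry count) of each header table
        "program_headers": (e_phoff, e_phentsize, e_phnum),
        "section_headers": (e_shoff, e_shentsize, e_shnum),
    }

# A hand-built header standing in for a real file (all values invented).
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
fake = ELF64_HEADER.pack(ident, 3, 62, 1, 0x1040, 64, 0x3000,
                         0, 64, 56, 2, 64, 5, 4)
header = parse_elf_header(fake)
print(header)
```

On a Linux machine you could presumably feed it the first 64 bytes of a real binary instead, for example open("/bin/ls", "rb").read(64), assuming that file exists and is a 64-bit little-endian ELF. The two tuples it returns point at the program header table and the section header table, which correspond to the two views described next.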
Section vs Segment
Within the ELF format there are two ways to view the object file/executable: the linking view and the execution view. Below is a diagram of the comparison.
ELF uses the link view at static linking time for relocatable file combination and the execution view at run time to load and execute programs. The linking view by and large deals with sections whereas the execution view deals with segments. Sections provide the information needed at link time and segments the information needed at runtime.
Sections have a name and type, a requested memory location at run time, and permissions. You can locate the sections by examining the section header table. The following rules apply to sections:
- Each section has exactly one section header describing it. Section headers may exist without a section.
- Each section occupies one contiguous (possibly empty) sequence of bytes in a file.
- Sections in a file do not overlap.
- An object file may have inactive space. The various headers and the sections might not cover every byte in an object file.
Segments group related sections. For example, the text segment groups executable code, the data segment groups the program data, and the dynamic segment groups information relevant to dynamic loading. Each segment consists of one or more sections. In this post, we are primarily interested in the PT_DYNAMIC type segment.
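One way to picture the relationship is to take a list of sections and a list of segments and ask which sections fall inside which segment's address range. The sketch below does that for an invented layout. The addresses, sizes and the helper sections_in are hypothetical, though the section and segment names follow ELF conventions.

```python
# Hypothetical layout of a small dynamically linked executable. The
# addresses and sizes are invented; the names follow ELF conventions.
sections = [
    # (name, address, size)
    (".text",    0x1000, 0x200),
    (".rodata",  0x1200, 0x080),
    (".dynamic", 0x3000, 0x100),
    (".got",     0x3100, 0x040),
    (".data",    0x3140, 0x020),
    (".bss",     0x3160, 0x010),
]
segments = [
    # (type, start address, size in memory)
    ("PT_LOAD (text)", 0x1000, 0x280),
    ("PT_LOAD (data)", 0x3000, 0x170),
    ("PT_DYNAMIC",     0x3000, 0x100),
]

def sections_in(segment, sections):
    """Names of the sections whose address range lies inside the segment."""
    _, seg_start, seg_size = segment
    seg_end = seg_start + seg_size
    return [name for name, addr, size in sections
            if seg_start <= addr and addr + size <= seg_end]

layout = {seg[0]: sections_in(seg, sections) for seg in segments}
for name, members in layout.items():
    print(f"{name:15} {members}")
```

In this made-up layout the PT_DYNAMIC segment covers only .dynamic, and .dynamic also sits inside the data load segment, so the same bytes can be described by more than one segment.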
Process Image and the Dynamic Linker
The process image is created by loading and interpreting the segments. When building an executable file that uses dynamic linking, the link editor adds a program header element of type PT_INTERP to an executable file, telling the system to invoke the dynamic linker as the program interpreter. The dynamic linker creates the process image for a program. At link time, the program or library is built by merging together sections with similar attributes into segments. Typically, all the executable and read-only data sections are combined into a single text segment, while the data and BSS are combined into the data segment. These segments are normally called load segments, because they need to be loaded in memory at process creation. Other sections such as symbol information and debugging sections are merged into other, non-load segments. (Source)
Creating the process image entails the following activities (source):
- Adding the executable file’s memory segments to the process image
- Adding shared object memory segments to the process image
- Performing relocations for the executable file and its shared objects
- Closing the file descriptor that was used to read the executable file, if one was given to the dynamic linker
- Transferring control to the program, making it look as if the program had received control directly from exec(BA_OS)
There are three sections we care about specifically in this post:
- .dynamic: The structure residing at the beginning of the section holds the addresses of other dynamic linking information.
- .got (global offset table): stores the addresses of external functions and variables
- .plt (procedure linkage table): stores indirect links into the GOT
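To illustrate what "holds the addresses of other dynamic linking information" means for .dynamic, here is a rough sketch that decodes the section as an array of (tag, value) pairs in the 64-bit layout. The tag numbers come from the ELF specification, while the function parse_dynamic and every address in the sample data are invented for the example.

```python
import struct

# A few tag values from the ELF specification's dynamic section table.
DT_NAMES = {0: "DT_NULL", 1: "DT_NEEDED", 3: "DT_PLTGOT",
            5: "DT_STRTAB", 6: "DT_SYMTAB", 23: "DT_JMPREL"}
DYN_ENTRY = struct.Struct("<qQ")   # Elf64_Dyn: signed tag, 64-bit value

def parse_dynamic(data):
    """Decode (tag, value) pairs until the terminating DT_NULL entry."""
    entries = []
    for tag, value in DYN_ENTRY.iter_unpack(data):
        if tag == 0:   # DT_NULL marks the end of the array
            break
        entries.append((DT_NAMES.get(tag, hex(tag)), value))
    return entries

# Hand-built section contents; the addresses are invented.
raw = b"".join(DYN_ENTRY.pack(tag, value) for tag, value in [
    (1, 0x1),        # DT_NEEDED: string-table offset of a library name
    (5, 0x400),      # DT_STRTAB: address of the dynamic string table
    (6, 0x2C0),      # DT_SYMTAB: address of the dynamic symbol table
    (3, 0x3100),     # DT_PLTGOT: address associated with the PLT/GOT
    (23, 0x5A0),     # DT_JMPREL: relocation entries used by the PLT
    (0, 0),          # DT_NULL
])
dynamic = parse_dynamic(raw)
for name, value in dynamic:
    print(f"{name:10} {value:#x}")
```

The dynamic linker starts from this array to find the symbol table, the string table, and the relocations it needs.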
Shared objects may occupy virtual memory addresses that are different from the addresses recorded in the file’s program header table. The dynamic linker relocates the memory image, updating absolute addresses before the application gains control. Although the absolute address values would be correct if the library were loaded at the addresses specified in the program header table, this normally is not the case.
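The following toy model shows the kind of fix-up described above. A dictionary stands in for the memory image, and a list of addresses stands in for the relocation records. The names PREFERRED_BASE and load, and all of the addresses, are assumptions for this sketch. A real dynamic linker handles several relocation types rather than this single "add the difference" case.

```python
# Toy image of a shared object linked as if it would load at address
# 0x1000. Two of its words hold absolute addresses, and the relocation
# list says which ones. All values are invented for illustration.
PREFERRED_BASE = 0x1000
image = {
    0x1000: "code ...",
    0x1008: 0x1020,      # pointer to a variable inside this object
    0x1010: 0x1000,      # pointer to the start of this object
    0x1020: 7,           # ordinary data, must not be touched
}
relocations = [0x1008, 0x1010]   # preferred addresses of words to fix

def load(image, relocations, actual_base):
    """Copy the image to actual_base and fix up its absolute addresses."""
    delta = actual_base - PREFERRED_BASE
    loaded = {addr + delta: value for addr, value in image.items()}
    for addr in relocations:
        loaded[addr + delta] += delta
    return loaded

loaded = load(image, relocations, actual_base=0x7F0000)
print(hex(loaded[0x7F0008]))   # 0x7f0020
```

If the object happens to land at its preferred base, the difference is zero and nothing changes, which matches the remark that the recorded absolute addresses would already be correct in that case.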
The Global Offset Table (GOT)
The GOT is a table of addresses which resides in the data section. If some instruction in code wants to refer to a variable it must normally use an absolute memory address. Instead of referring to the absolute memory address, it refers to the GOT, whose location is known. The relative location of the GOT from the instruction in question is constant.
Now you might be thinking, “Great, but I still have to resolve all those addresses within the GOT so what’s the point?” There are two things using the GOT gets us.
- Without a GOT, we would have to relocate every reference in the code section. If every reference goes through the GOT, we only have to update the GOT entry once per variable. This is much more efficient.
- The data section is both writable and not shared between processes. Performing relocations in this section causes no harm, whereas relocations in the code section prevent sharing, which defeats the purpose of a shared library.
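Both points can be seen in a small simulation. In the sketch below, code is a list of (opcode, operand) tuples, and the functions patch_direct, patch_via_got and run are hypothetical stand-ins for what a loader and CPU would do. It compares how many words need patching once a variable's address is known, and shows that the GOT version leaves the code untouched.

```python
# Toy comparison: how many words must the loader patch when a variable's
# final address becomes known? "Code" is a list of (opcode, operand)
# tuples; the encoding is hypothetical.

def patch_direct(code, var_address):
    """No GOT: every load embeds the address, so each one is patched."""
    patched = 0
    for i, (op, operand) in enumerate(code):
        if op == "load_abs":
            code[i] = (op, var_address)
            patched += 1
    return patched

def patch_via_got(got, slot, var_address):
    """With a GOT: code holds only a slot number; one data word changes."""
    got[slot] = var_address
    return 1

def run(code, got, memory):
    """Execute the loads and return the values they read."""
    values = []
    for op, operand in code:
        if op == "load_abs":
            values.append(memory[operand])
        elif op == "load_got":
            values.append(memory[got[operand]])   # extra indirection
    return values

memory = {0x5000: 99}                      # the variable lives here
direct_code = [("load_abs", None)] * 3     # three references, unpatched
got_code = [("load_got", 0)] * 3           # three references via slot 0
got = [None]
got_code_before = list(got_code)

direct_patches = patch_direct(direct_code, 0x5000)
got_patches = patch_via_got(got, 0, 0x5000)
print(direct_patches, got_patches)         # 3 1
```

Both versions read the same value afterwards. The difference is that only the direct version had to write into the code.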
Here is an example I pulled from Eli Bendersky’s explanation:
In pseudo-assembly, we replace an absolute addressing instruction:
    ; Place the value of the variable in edx
    mov edx, [ADDR_OF_VAR]
With displacement addressing from a register, along with an extra indirection:
    ; 1. Somehow get the address of the GOT into ebx
    lea ebx, ADDR_OF_GOT

    ; 2. Suppose ADDR_OF_VAR is stored at offset 0x10
    ;    in the GOT. Then this will place ADDR_OF_VAR
    ;    into edx.
    mov edx, DWORD PTR [ebx + 0x10]

    ; 3. Finally, access the variable and place its
    ;    value into edx.
    mov edx, DWORD PTR [edx]
If you would like to see the rest of the process in a high level of detail, I strongly suggest taking a look at Eli Bendersky’s article, under the section titled “PIC with data references through GOT – an example”.
This is straightforward enough for global variables, but what about function calls? Theoretically, things could work the same way, but they’re actually a bit more complicated.
The Procedure Linkage Table (PLT)
The PLT is part of the executable text section and contains an entry for each external function the shared library calls. Each PLT entry is a short chunk of executable code. Instead of calling the function directly, the code calls an entry in the PLT, which then calls the actual function. Each entry in the PLT also has a corresponding entry in the GOT, which contains the actual address of the function, but only after the dynamic loader has resolved it.
The PLT uses what is called lazy resolution. It won’t actually resolve the address of a function until it absolutely has to. This makes it so effort is only put into resolving those functions actually used. The process works in the following manner:
- A function func is called and the compiler translates this to a call to func@plt.
- The program jumps to the PLT entry, which jumps to whatever address its GOT entry holds. If the function hasn’t been previously called, the GOT entry points back into the PLT to a resolver routine, otherwise it points to the function itself.
- If the function hasn’t been previously called, the program jumps back from the GOT to the PLT, which then runs a resolver routine to update the GOT entry with the actual address of the function.
The reason for this lazy resolution is that it saves us the trouble of resolving functions that are never actually called at runtime.
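The steps above can be mimicked in a few lines of Python. In this toy model LIBRARY stands in for a loaded shared object, call_via_plt plays the role of a PLT entry, and each GOT slot starts out holding a resolver stub. All of these names are invented, and a real PLT is machine code rather than a function that does a dictionary lookup.

```python
# Toy model of lazy binding. `LIBRARY` stands in for a loaded shared
# object, each GOT slot starts out pointing at a resolver stub, and
# `call_via_plt` plays the role of a PLT entry. All names are invented.
LIBRARY = {"square": lambda x: x * x, "negate": lambda x: -x}
resolved_log = []   # records each time the resolver actually runs

def make_resolver_stub(got, name):
    def stub(*args):
        # First call: look the symbol up and overwrite the GOT slot,
        # then finish the call the program originally wanted.
        resolved_log.append(name)
        got[name] = LIBRARY[name]
        return got[name](*args)
    return stub

got = {}
for name in LIBRARY:
    got[name] = make_resolver_stub(got, name)

def call_via_plt(name, *args):
    """A PLT entry: jump to whatever the GOT slot currently holds."""
    return got[name](*args)

first = call_via_plt("square", 4)    # goes through the resolver
second = call_via_plt("square", 5)   # jumps straight to the function
print(first, second, resolved_log)   # 16 25 ['square']
```

The resolver runs once for square and never for negate, which is never called, so no effort goes into resolving it.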
Again, if you would like to see a specific example, I strongly recommend Eli Bendersky’s article. Look under the section “PIC with function calls through PLT and GOT – an example”.