Intro
What is under the hood of printf in glibc? Where and how does printf get its arguments if only the format string is provided, as in a format string vulnerability? I’d like to share my findings and open up a discussion, as this question has puzzled me for a long time.
Leave a comment for me if you have further thoughts!🤠
Printf and Variable arguments
printf accepts variable arguments, which are tracked through a va_list structure (called ap_list in this post). When printf hands off to its internal worker vsprintf_internal (spelled __vfprintf_internal in recent glibc sources), the ap_list address is one of that function's arguments. You can set a breakpoint there and run info args in gdb to print the ap_list pointer value.
s: The output stream (stdout in this case).
format: The format string used in printf.
ap: The va_list containing the variable arguments.
mode_flags: Additional flags (not critical for our purpose).
ap_list is a structure that tracks two lists of candidate values for format specifiers: the register argument list and the stack argument list.
The register argument list holds a copy of the argument registers, saved before those registers are reused for other purposes.
Whenever a format specifier (like %s) is encountered in the format string, vsprintf_internal consumes the next argument from one of the two lists maintained in ap_list.
typedef struct {
We can inspect the ap_list structure and both lists with the commands below while paused at vsprintf_internal:
x/16gx $ap
x/16gx 0x7fffffffd9f0 // stack arguments list
If you try to leak memory with a call like printf("%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p"), the output looks like this:
0x555555559830.0x555555559830.(nil).(nil).0x1.0x555555559830.(nil).0x2e70252e70252e70.(nil).0x7fffffffda80.0x555555555b1a.0x5555555592a0.0xa00000000.(nil).(nil).(nil).
As you can see, the leaked values come from both the register argument list and the stack argument list, which are stored in different memory locations.
If you are curious why the first register argument, 0x555555557dd8, is not leaked: the format string is itself an argument. This address is where the format string is stored, and printf has already consumed it, so it never shows up in the leak.
Reference
Wenliang Du
ChatGPT