Asked  7 Months ago    Answers:  5   Viewed   29 times

I was wondering how to use GCC on my C source file to dump a mnemonic version of the machine code so I could see what my code was being compiled into. You can do this with Java but I haven't been able to find a way with GCC.

I am trying to re-write a C method in assembly and seeing how GCC does it would be a big help.



If you compile with debug symbols, you can use objdump to produce a more readable disassembly.

>objdump --help
-S, --source             Intermix source code with disassembly
-l, --line-numbers       Include line numbers and filenames in output

objdump -drwC -Mintel is nice:

  • -r shows symbol names on relocations (so you'd see puts in the call instruction below)
  • -R shows dynamic-linking relocations / symbol names (useful on shared libraries)
  • -C demangles C++ symbol names
  • -w is "wide" mode: it doesn't line-wrap the machine-code bytes
  • -Mintel: use GAS/binutils MASM-like .intel_syntax noprefix syntax instead of AT&T
  • -S: interleave source lines with disassembly.

You could put something like alias disas="objdump -drwCS -Mintel" in your ~/.bashrc


> gcc -g -c test.c
> objdump -d -M intel -S test.o

test.o:     file format elf32-i386

Disassembly of section .text:

00000000 <main>:
#include <stdio.h>

int main(void)
   0:   55                      push   ebp
   1:   89 e5                   mov    ebp,esp
   3:   83 e4 f0                and    esp,0xfffffff0
   6:   83 ec 10                sub    esp,0x10
   9:   c7 04 24 00 00 00 00    mov    DWORD PTR [esp],0x0
  10:   e8 fc ff ff ff          call   11 <main+0x11>

    return 0;
  15:   b8 00 00 00 00          mov    eax,0x0
  1a:   c9                      leave  
  1b:   c3                      ret

Note that this isn't using -r, so the call's rel32 = -4 isn't annotated with the puts symbol name, and it looks like a broken call that jumps into the middle of the call instruction in main. Remember that the rel32 displacement in the call encoding is just a placeholder until the linker fills in a real offset (to a PLT stub in this case, unless you statically link libc).
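For reference, here is a minimal sketch of a source file that could be fed through the same commands. The file name demo.c and the function greet are assumptions for illustration, not the test.c from the listing above, and the exact instructions you see will vary with compiler version, target, and optimization level.

```c
/* demo.c: hypothetical input file for trying the commands above.
 *
 *   gcc -g -c demo.c                  (-g keeps the debug info that -S relies on)
 *   objdump -drwC -Mintel -S demo.o   (-r annotates the call with its relocation)
 */
#include <stdio.h>

/* Calls puts() so the object file contains a call whose target is still an
 * unresolved relocation until link time.
 * Returns the result of puts(), which is non-negative on success. */
int greet(void)
{
    return puts("test");
}
```

Because the file is only compiled with -c, it does not need a main of its own to be disassembled.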

Tuesday, June 1, 2021
answered 7 Months ago

You can use stepi or nexti (which can be abbreviated to si or ni) to step through your machine code.
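A sketch of how that might look in a session. The function sum_to, the file names step.c and main.c, and the program name are hypothetical; break, run, display/i $pc, stepi, nexti, and info registers are standard GDB commands.

```c
/* step.c: hypothetical function to single-step at the instruction level.
 * Assumes it is linked with a main.c (not shown) whose main() calls sum_to().
 *
 *   gcc -g -O0 step.c main.c -o step
 *   gdb ./step
 *   (gdb) break sum_to
 *   (gdb) run
 *   (gdb) display/i $pc     show the next instruction after every stop
 *   (gdb) stepi             execute one machine instruction (si)
 *   (gdb) nexti             same, but step over calls (ni)
 *   (gdb) info registers    inspect register state between steps
 */

/* Sums the integers 1..n with a plain loop, which gives GDB a handful of
 * instructions to step through. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```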

Sunday, June 20, 2021
answered 6 Months ago

The reason your example code doesn't work is that the "p" constraint is of only very limited use in inline assembly. All inline assembly operands must be representable as an operand in assembly language. If the operand isn't representable, then the compiler makes it so by moving it to a register first and substituting that register as the operand. The "p" constraint places an additional restriction: the operand must be a valid address. The problem is that a register isn't a valid address. A register can contain an address, but a register is not itself a valid address.

That means the operand of the "p" constraint must have a valid assembly representation as is and be a valid address. You're trying to use the address of a variable on the stack as the operand. While this is a valid address, it's not a valid operand. The stack variable itself has a valid representation (something like 8(%rbp)), but the address of the stack variable doesn't. (If it were representable it would be something like 8 + %rbp, but this isn't a legal operand.)

One of the few things that you can take the address of and use as an operand with the "p" constraint is a statically allocated variable. In this case it's a valid assembly operand, as it can be represented as an immediate value (e.g. &kernel_stack can be represented as $kernel_stack). It's also a valid address and so satisfies the constraint.

So that's why the Linux kernel macro works and your macro doesn't. You're trying to use it with stack variables, while the kernel only uses it with statically allocated variables.
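To illustrate the distinction, here is a minimal sketch, assuming GCC or Clang on x86 with the default AT&T syntax. The function name load_via_m is hypothetical. It uses the "m" constraint, which accepts the stack variable itself as a memory operand; it does not reproduce the kernel macro or the "p" constraint.

```c
/* Returns the value of a local variable by reading it through an inline-asm
 * memory operand. On non-x86 targets it falls back to plain C. */
int load_via_m(int value)
{
    int local = value;   /* placed in memory so it can satisfy "m" */
    int result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "m"(local) is substituted as a memory operand such as -4(%rbp),
     * which is a valid assembly operand for the stack variable itself. */
    __asm__ ("movl %1, %0" : "=r"(result) : "m"(local));
#else
    result = local;
#endif
    return result;
}
```

Compiling this with gcc -S and reading the generated movl shows the frame-relative operand the compiler chose for the variable.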

Or at least what looks like a statically allocated variable to the compiler. In fact, kernel_stack is allocated in a special section used for per-CPU data. This section doesn't actually exist; instead it's used as a template to create a separate region of memory for each CPU. The offset of kernel_stack in this special section is used as the offset in each per-CPU data region to store a separate kernel stack value for each CPU. The FS or GS segment register is used as the base of this region, with each CPU using a different address as the base.

So that's why the Linux kernel uses inline assembly to access what otherwise looks like a static variable. The macro is used to turn the static variable into a per-CPU variable. If you're not trying to do something like this, then you probably don't have anything to gain by copying the kernel macro. You should probably be considering a different way to do what you're trying to accomplish.

Now, if you're thinking that since Linus Torvalds has come up with this optimization in the kernel, replacing an "m" constraint with a "p", it must be a good idea to do this generally, you should be very aware of how fragile this optimization is. What it's trying to do is fool GCC into thinking that a reference to kernel_stack doesn't actually access memory, so that GCC won't keep reloading the value every time the code writes to memory. The danger here is that if kernel_stack does change, then the compiler will be fooled and will continue to use the old value. Linus knows when and how the per-CPU variables are changed, and so can be confident that the macro is safe when used for its intended purpose in the kernel.

If you want to eliminate redundant loads in your own code, I suggest using -fstrict-aliasing and/or the restrict keyword. That way you're not dependent on fragile, non-portable inline assembly macros.
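As a sketch of the restrict suggestion (the function add_scaled and its parameters are made up for illustration): qualifying the pointers tells the compiler that the objects do not overlap, so it is allowed to load *factor once rather than after every store to dst. Whether it actually does so depends on the compiler and the optimization level.

```c
#include <stddef.h>

/* Adds (*factor) * src[i] to each dst[i].
 * restrict promises the compiler that dst, src and factor never alias, so a
 * store to dst[i] cannot change *factor and the load may be hoisted out of
 * the loop. Calling this with overlapping pointers is undefined behavior. */
void add_scaled(int *restrict dst, const int *restrict src,
                const int *restrict factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * *factor;
}
```

Since dst and factor both point to int, type-based aliasing rules alone cannot rule out overlap here, which is why the example relies on restrict.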

Wednesday, August 4, 2021
answered 4 Months ago

Dead code optimization is typically done by the linker - the compiler doesn't have the overview. However, the compiler might have eliminated unused static functions (as they have internal linkage).

Therefore, you shouldn't be looking at GCC options, but at ld options. It seems you want --print-gc-sections. However, note that you probably want GCC to put each function in its own section, -ffunction-sections. By default GCC will put all functions in the same section, and ld isn't smart enough to eliminate unused functions - it can only eliminate unused sections.
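A minimal sketch of that setup, with hypothetical file and function names. -ffunction-sections is a GCC option, and --gc-sections and --print-gc-sections are GNU ld options; the sketch assumes a separate main.c (not shown) that calls only used_helper.

```c
/* helpers.c: hypothetical translation unit with one used and one unused function.
 *
 *   gcc -ffunction-sections -c helpers.c main.c
 *   gcc -Wl,--gc-sections -Wl,--print-gc-sections helpers.o main.o -o demo
 *
 * With -ffunction-sections each function gets its own section
 * (.text.used_helper, .text.unused_helper), so ld can discard the
 * unreferenced one and --print-gc-sections reports what it removed.
 */

/* Assumed to be called from main.c. */
int used_helper(int x)
{
    return x * 2;
}

/* Never referenced. It has external linkage, so the compiler has to keep it
 * in helpers.o and only the linker can drop it. */
int unused_helper(int x)
{
    return x + 1;
}
```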

Friday, August 13, 2021
answered 4 Months ago

GCC uses AT&T assembly syntax, while pushad/popad are the Intel mnemonics; try this:
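A sketch of what such a replacement could look like, assuming 32-bit x86 and GAS in its default AT&T mode, where the corresponding mnemonics are pushal and popal. The function name save_restore_all is hypothetical.

```c
/* Saves and restores all general-purpose registers around a (here empty) region.
 * pushal/popal exist only in 32-bit mode, so other targets compile a no-op.
 * Returns 1 when the instructions were emitted, 0 otherwise. */
int save_restore_all(void)
{
#if defined(__GNUC__) && defined(__i386__)
    /* Both instructions sit in one asm statement so the stack pointer is
     * back at its original value before the compiler's own code resumes. */
    __asm__ volatile ("pushal\n\t"
                      "popal");
    return 1;
#else
    return 0;
#endif
}
```

On an x86-64 toolchain this would need -m32 to take the inline-assembly path, since these instructions are not available in 64-bit mode.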

Wednesday, August 25, 2021
Jakob Gade
answered 4 Months ago