Thursday, March 25, 2010

Shared libraries and ELF

I just finished reading How to write Shared Libraries by Ulrich Drepper, a very nice guide to understanding the details of dynamic linking and the use of the ELF format on Linux. The topics I found most useful:
  • ELF structure like program header and segments
  • Clear explanation of GOT and PLT with concrete examples
  • Types of relocations: relative, symbol relocation and evils of TEXTREL
  • Tracing dynamic linking w/ LD_DEBUG
  • concept of gnu-hash style and backwards compatibility with -hash-style=both
  • attributes constructor and destructor
  • common variables and -fno-common
  • controlling exported symbols with static / -fvisibility / attribute / #pragma visibility / export maps
  • symbol versioning: implementing backwards compatibility with a single dso: export maps and symver foo@VERS1 for past versions / symver foo@@VERS2 for the latest
  • dangers of depending on undefined symbols and -Wl,-z,defs
  • deprecated rpath vs runpath
  • relative paths to dso:s using dynamic string tokens like $ORIGIN
  • --as-needed to prune the list of DSO in DT_NEEDED
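
The constructor/destructor and visibility attributes from the list above can be sketched in a few lines of C. This is only an illustration, not an example taken from Drepper's paper: every name starting with demo_ is made up, and the code assumes gcc or a compatible compiler.

```c
#include <stdio.h>

/* Hypothetical library state, set up when the object is loaded. */
static int init_calls;

/* Runs automatically before main(), or when the DSO is dlopen()ed. */
__attribute__((constructor))
static void demo_init(void)
{
    init_calls++;
}

/* Runs automatically when the DSO is unloaded or the process exits. */
__attribute__((destructor))
static void demo_fini(void)
{
    fputs("demo library unloaded\n", stderr);
}

/* Internal helper: usable from every object file inside the DSO,
 * but not exported through the dynamic symbol table. */
__attribute__((visibility("hidden")))
int demo_internal_scale(int x)
{
    return 2 * x;
}

/* Public entry point: stays exported even when the DSO is built
 * with -fvisibility=hidden. */
__attribute__((visibility("default")))
int demo_scaled_init_count(void)
{
    return demo_internal_scale(init_calls);
}
```

Built with something like gcc -shared -fPIC -fvisibility=hidden, nm -D on the result should list demo_scaled_init_count but not demo_internal_scale.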

Wednesday, March 24, 2010

gnu binutils

I decided to read through the man pages of binutils, to see if there's something interesting I've never come across. The only things that were not previously very familiar were the c++filt and addr2line tools. Anyway, here are the notes I made while reading about each of the tools in the binutils package.

ar and ranlib
manage archives: static libraries of compiled object files. Most often these are used indirectly through libtool and automake.

nm
Lists symbols from object files.
This is useful when troubleshooting linking problems. With nm you can check which data and functions are actually found in each intermediate object or library.

objcopy
Copies and converts object files.
A typical use of objcopy is to convert an ELF file into a raw binary file. This effectively pre-computes the work of an ELF loader for targets where no ELF loader is available at run time, such as ROM code.
Objcopy can also work the other way: it can wrap a binary file in an ELF object, so that binary data can be embedded into an application during linking. However, this is often done by dumping the binary as C source (xxd -i) and then compiling it into an object, which is often easier and more portable.
Objcopy can also transform objects within a single format. It can strip sections or transform symbol names (--prefix-symbols=string).
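
For reference, the xxd -i route mentioned above produces plain C like the following. The file name blob.bin and its four bytes are invented for this sketch, and blob_checksum is a hypothetical helper added only to show that the data is ordinary memory.

```c
#include <stddef.h>

/* Shape of the output of `xxd -i blob.bin` for a 4-byte file.
 * The array and length names are derived from the file name. */
unsigned char blob_bin[] = {
  0xde, 0xad, 0xbe, 0xef
};
unsigned int blob_bin_len = 4;

/* Hypothetical consumer: sum the embedded bytes. */
unsigned int blob_checksum(void)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < blob_bin_len; i++)
        sum += blob_bin[i];
    return sum;
}
```

With the objcopy route (-I binary) the data is reached through linker symbols such as _binary_blob_bin_start and _binary_blob_bin_end instead, and the output format must be chosen to match the target.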

objdump
Displays the contents of an object file.
Objdump (-d) can disassemble an object to show what the compiler actually produced.
Objdump (-t) lists the symbols and their sizes in an ELF file. Sort this list and see where to focus your size optimization efforts first.

size
Displays the text, data and bss sizes of an object file. A good sanity check when working with a memory-constrained target.

strings
Displays anything that resembles text. Good for first look at foreign binaries.

strip
Slims down an object file by removing all that is not necessary for loading and relocating the binary. Symbols and debugging information are removed by default. Manually it is possible to remove any specific sections.

c++filt
Interprets c++ mangled symbols. Individual command line arguments can be demangled, or c++filt can filter symbols found in the middle of a stdin/out stream.

addr2line
Traces addresses in a binary back to source files, functions and line numbers. It can optionally demangle symbols like c++filt.

readelf
Interprets the structure of an ELF binary file.
-h shows the file header: which machine the file is for
-l shows the segments/program headers: what to load into memory
-S shows the sections
-s shows the symbols
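
To get a feel for what readelf -h reads, here is a rough sketch that pulls the machine field out of an ELF header held in memory. The function name elf_machine is made up; the offsets are the ones from the ELF specification (magic at bytes 0-3, data encoding at byte 5, e_machine at byte 18 in both ELF32 and ELF64).

```c
#include <stddef.h>
#include <stdint.h>

enum {
    EI_DATA_OFFSET   = 5,   /* 1 = little endian, 2 = big endian */
    E_MACHINE_OFFSET = 18   /* 16-bit machine id */
};

/* Returns e_machine (for example 40 for ARM, 62 for x86-64),
 * or -1 if buf does not start with a plausible ELF header. */
int elf_machine(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 20)
        return -1;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;

    const unsigned char b0 = buf[E_MACHINE_OFFSET];
    const unsigned char b1 = buf[E_MACHINE_OFFSET + 1];

    switch (buf[EI_DATA_OFFSET]) {
    case 1:  return b0 | (b1 << 8);   /* little endian: low byte first */
    case 2:  return (b0 << 8) | b1;   /* big endian: high byte first */
    default: return -1;               /* unknown data encoding */
    }
}
```

Feeding it the first 20 bytes of a file read with fread() is enough to tell which architecture a binary was built for.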


I decided to skip the windows-specific tools. Instead, I should some day match these up with Mach-O specific tools.

Monday, March 22, 2010

Dangers of C library

I just read through Ulrich Drepper's Defensive Programming article. He highlights dangers of C library, provides safer alternatives for commonly used idioms and introduces a number of tools for pinpointing problems in code.

These are my rather unstructured notes of things most interesting to me.


Syntax for forward declarations within the parameter list and array size annotation for future checks
int addvec(int n; int arr[n], int n)

tools for formatting an arbitrarily long string
asprintf() which allocates the buffer, for constant format string only
FILE *open_memstream(), for an ostringstream -like in-memory file
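
A minimal sketch of the open_memstream() idea, assuming a POSIX.1-2008 C library. The function join_numbers is a made-up example, not something from the article.

```c
#define _POSIX_C_SOURCE 200809L  /* for open_memstream() */
#include <stdio.h>
#include <stdlib.h>

/* Build "0,1,...,n-1" without guessing a buffer size up front.
 * Returns a malloc()ed string the caller must free(), or NULL on error. */
char *join_numbers(int n)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *out = open_memstream(&buf, &len);
    if (out == NULL)
        return NULL;

    for (int i = 0; i < n; i++)
        fprintf(out, i ? ",%d" : "%d", i);

    /* buf and len are only guaranteed to be up to date after
     * fflush() or fclose(). */
    if (fclose(out) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The stream grows the buffer as needed, so the usual fixed-size sprintf() buffer and its overflow risk disappear.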

fast string handling within stack
strdupa() and alloca()

parsing strings of arbitrary size
%as format string with buffer allocated by scanf

parsing simple delimited files with arbitrary length of lines
ssize_t getdelim()
ssize_t getline()
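
A small sketch of reading lines of any length with getline(), assuming a POSIX.1-2008 C library. The function longest_line is a hypothetical example of mine.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Return the length of the longest line in fp, not counting the newline. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    size_t longest = 0;
    ssize_t n;

    while ((n = getline(&line, &cap, fp)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            n--;                     /* ignore the trailing newline */
        if ((size_t)n > longest)
            longest = (size_t)n;
    }
    free(line);          /* one free() no matter how many lines were read */
    return longest;
}
```

The same buffer is reused for every line, so there is neither a fixed line length limit nor an allocation per line.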

practical examples of using file descriptor variants of regular functions for avoiding race conditions
eg. chown -> fchown

O_NOFOLLOW for avoiding dereferencing symbolic links
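
The descriptor-based pattern and O_NOFOLLOW can be combined roughly like this. It is a sketch under the assumption of a POSIX.1-2008 system; open_regular_nofollow is a made-up helper name.

```c
#define _POSIX_C_SOURCE 200809L  /* for O_NOFOLLOW */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open path read-only, refusing a symbolic link as the final path
 * component and anything that is not a regular file.
 * Returns a file descriptor, or -1 on failure. */
int open_regular_nofollow(const char *path)
{
    int fd = open(path, O_RDONLY | O_NOFOLLOW);
    if (fd < 0)
        return -1;

    struct stat st;
    /* fstat() examines the object that was actually opened. A later
     * stat(path) could see a different file if the name was swapped
     * in between. */
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Any follow-up operation (fchown, fchmod, fstat) should then go through the descriptor rather than the path name.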

mapping a function over a directory tree with nftw()

create temporary files only with mkstemp() or tmpfile()
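
A rough sketch of the mkstemp() pattern. The /tmp/scratch- prefix and the function name make_scratch_file are assumptions for illustration only.

```c
#define _POSIX_C_SOURCE 200809L  /* for mkstemp() */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a private temporary file containing text and return a
 * descriptor positioned at its start, or -1 on failure. */
int make_scratch_file(const char *text)
{
    char path[] = "/tmp/scratch-XXXXXX";  /* mkstemp() fills in the Xs */
    int fd = mkstemp(path);               /* name is created exclusively */
    if (fd < 0)
        return -1;
    unlink(path);                         /* file lives on until fd is closed */

    size_t len = strlen(text);
    if (write(fd, text, len) != (ssize_t)len ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the file is created and opened in one step, there is no window in which another process can plant a file or symlink under the chosen name.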

unix domain socket authentication
getsockopt(SO_PEERCRED)

proper randomness
  • initstate() from /dev/random
  • random()
_FORTIFY_SOURCE for compiling in runtime memory checks around standard C runtime.

use of __attribute__ __deprecated__ and concept of .gnu.warning -sections within ELF.



good overview of tools

MALLOC_PERTURB_
  • forces uninitialized memory to contain given garbage
mtrace()
  • output a trace file with malloc and free operations for single threaded programs
mcheck()
dmalloc
  • replaces malloc
electric fence
  • inserts protected pages around allocated memory, over freed memory
valgrind
  • virtual machine for IA-32 only
mudflap
  • gcc4 inserted instrumentation and library for memory checking
catchsegv / libsegfault
  • dump backtrace on crash
_GLIBCXX_DEBUG - templates with assertions

Thursday, March 18, 2010

Booting Linux over JTAG

Note to self. A working gdb script to boot Linux kernel on an atmel board over JTAG.

Unfortunately this is not completely automatic. u-boot needs to be interrupted for loading the uImage through gdb. U-boot can then be resumed and kernel booted with bootm 0x22000000.

Also, either the emulator needs to be configured to ignore data aborts or kernel futex support needs to be disabled.

target remote emulator:2001

monitor reset halt

define bootstrap
file /path/to/at91sam9260ek/at91sam9260ek-nandflashboot-2.11-r1.elf
load
j *0x200000 #bootstrap entry point in sram
end

define u-boot
file /path/to/u-boot/u-boot
load
j *0x23f00000 # u-boot entry point in sdram
end

define linux
restore /path/to/arch/arm/boot/uImage binary 0x22000000
file /path/to/vmlinux
end

Sunday, March 14, 2010

Optimizing with gcc

My notes while reading Optimizing applications with gcc & glibc by Ulrich Drepper.
  • __extension__ to mark intentional use of GNU extension.
  • poor man's 'lambda': statement expressions, ({ })
  • predicate to determine if a parameter value is a compile-time constant: __builtin_constant_p()
  • mark pure functions with __attribute__((__const__))
  • mark functions which never return with __attribute__((__noreturn__))
  • __stdcall__ attribute on x86 makes callee correct the stack at ret, in Unix convention the caller corrects the stack
  • mempcpy returns a pointer just after the last copied byte.
  • calloc knows if memory is already full of 0's -> faster than malloc & memset
  • int_least16_t, int_fast16_t when actual variable size is uninteresting, speed is
  • replace strlen and + with strchr(s,'\0') or rawmemchr()
  • gcc computed gotos and jump tables: && for the address of a label
  • -pg and -profile : gprof and sprof
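
Two of the GNU C extensions from the list, statement expressions and computed gotos, can be sketched as follows. This assumes gcc or a compatible compiler. MAX_ONCE, run_program and the three opcodes are invented for the example.

```c
/* Statement expression: each argument is evaluated exactly once,
 * unlike the classic ((a) > (b) ? (a) : (b)) macro. */
#define MAX_ONCE(a, b)                 \
    __extension__ ({                   \
        __typeof__(a) a_ = (a);        \
        __typeof__(b) b_ = (b);        \
        a_ > b_ ? a_ : b_;             \
    })

/* Tiny bytecode interpreter dispatching through a jump table of
 * label addresses. Hypothetical opcodes: 0 = halt, 1 = increment,
 * 2 = double. Opcodes are not validated in this sketch. */
int run_program(const unsigned char *code)
{
    static void *const dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
    int acc = 0;

#define NEXT() goto *dispatch[*code++]
    NEXT();
op_inc:
    acc += 1;
    NEXT();
op_dbl:
    acc *= 2;
    NEXT();
op_halt:
#undef NEXT
    return acc;
}
```

Compared with a switch inside a loop, each handler jumps straight to the next one, which saves a bounds check and a branch back to the top of the loop.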

Saturday, March 13, 2010

Eclipse CDT

I'll have to try out the Eclipse CDT sometime
- integration of native development tools such as the GNU Autotools: Linux Tools
- a plugin to explore remote system files through Eclipse: eclipse-rse

Wednesday, March 10, 2010

64bit PC's and memory

Just to refresh my knowledge on the PC platforms, I read the excellent Wikipedia article on the 64-bit x86.

PC processors and 64-bitness


The current 64-bit instruction set is a backwards compatible extension to the Intel instruction set. It is implemented in all typical processors from all common vendors. It has multiple confusing names, all meaning the same: Intel 64, amd64, EM64T, x86_64, x86-64, IA-32e.

Intel Itanium (IA-64) is a completely different thing, and has nothing to do with current 64bit systems.

Not all of the 64 bits are in use today. Some of them are cleverly saved for the future by dividing the virtual address space into two halves. By having the unused gap in the middle as a forbidden zone, the address bits reserved for future use are safe from short term abuse by implementations.
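
The "two halves with a forbidden gap" rule can be written as a small check. This is a sketch: the function name is made up, and the 48 implemented address bits used in the usage note are an assumption about common implementations, not something fixed by the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An address is valid ("canonical") when all bits from the highest
 * implemented bit upwards are identical: all zeros for the lower
 * half, all ones for the upper half. */
bool is_canonical(uint64_t addr, unsigned implemented_bits)
{
    assert(implemented_bits >= 1 && implemented_bits <= 64);
    uint64_t upper = addr >> (implemented_bits - 1);
    return upper == 0 || upper == (UINT64_MAX >> (implemented_bits - 1));
}
```

With 48 implemented bits, 0x00007fffffffffff is the last address of the lower half and 0xffff800000000000 the first of the upper half; everything in between is rejected.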

Maximum physical memory in a PC

Having more than 4GB of physical memory is not an issue for 64-bit systems. Even a single process can use more than 4GB.

Using more than 4GB of physical memory has been possible for 32-bit systems as well since Pentium Pro ('90s). Physical Address Extension alters the page table structure to support this. Without PAE, each page table has 1024 32-bit entries. With PAE, a page table has 512 64-bit entries, even though only 36 of the address bits are used. A single process still cannot address more than 4GB, but you could have multiple processes, each mapping up to 4GB of physical RAM.
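
The numbers above follow from simple arithmetic, sketched here. The 4 KiB page size is my assumption (it is the usual x86 page size, but the text above does not state it), and both function names are made up.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_SIZE_BYTES = 4096 };  /* assumption: 4 KiB pages */

/* A page table occupies exactly one page, so the entry size
 * determines how many entries fit. */
unsigned entries_per_table(unsigned entry_size_bytes)
{
    assert(entry_size_bytes != 0 && PAGE_SIZE_BYTES % entry_size_bytes == 0);
    return PAGE_SIZE_BYTES / entry_size_bytes;
}

/* Number of bytes reachable with the given number of address bits. */
uint64_t addressable_bytes(unsigned address_bits)
{
    assert(address_bits < 64);
    return UINT64_C(1) << address_bits;
}
```

Doubling the entry size from 4 to 8 bytes halves the entries per table from 1024 to 512, while going from 32 to 36 physical address bits raises the limit from 4GB to 64GB.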

Monday, March 8, 2010

Linux Kernel in a nutshell

Although I've been working with Linux kernel for ages, I decided to browse through Linux Kernel in a nutshell, an excellent book on the practical things regarding working with the kernel. There certainly were things that I was not aware of. Here are my notes on these things.


Using a read-only source tree with O=/path/to/build/output

Some basic static analysis with

  • sparse with C=1 or C=2.
  • checkstack
  • namespacecheck

Localversion-prefixed files in object and source trees.


Finding all kernel modules required by the running system, using modaliases in the /sys tree and modprobe with --show-depends. Or get-driver.sh helper script.


Kernel argument max_addr to force an upper bound on the physical addresses used. Probably good for validating a software image for a low cost board version. I've used mem=XX for this before, but I believe max_addr could yield even more realistic results.


The book included nice "demos" for tools like

  • quilt for managing a set of patches against an upstream source tree
  • ketchup as a frontend to upstream kernel archives.


These were not described in detail, but are interesting and I need to remember to read more about them later.

  • RCU's
  • format of modaliases
  • seccomp for sandboxing processes
  • kernel SPI support
  • inotify mechanism
  • oprofile tooling
  • kernel security models.