Saturday, September 18, 2010

Linux SPI

My notes while reading about Linux and SPI.

De facto - no standardization like i2c/smbus.
Master/slave like i2c
Faster clock 10MHz vs 400kHz i2c

Full duplex, unlike i2c.
Daisy chaining (in addition to / instead of) wired-AND bus like i2c.
Dedicated chip select signals instead of i2c addressing.

Linux software support only applies to master side, like i2c.

Four clocking modes (?)
- CPOL - clock starts low(0)/high(1)
- CPHA - sample data on leading(0)/trailing(1) edge
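
A small sketch of how the two flags combine into the usual mode number 0..3. The bit values mirror SPI_CPHA and SPI_CPOL in the kernel's linux/spi/spi.h; the DEMO_ names and the spi_mode() helper are made up for illustration.

```c
#include <stdbool.h>

/* Same bit values as SPI_CPHA / SPI_CPOL in <linux/spi/spi.h>. */
#define DEMO_SPI_CPHA 0x01u /* sample on the trailing clock edge */
#define DEMO_SPI_CPOL 0x02u /* clock idles high */

/* Hypothetical helper: combine the two flags into a mode number 0..3. */
unsigned int spi_mode(bool cpol, bool cpha)
{
    return (cpol ? DEMO_SPI_CPOL : 0u) | (cpha ? DEMO_SPI_CPHA : 0u);
}
```

So the common "mode 0" is CPOL=0, CPHA=0 and "mode 3" is CPOL=1, CPHA=1.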

- SCK - clock, idle when not used like in i2c
- MOSI: Master Out, Slave In
- MISO: Master In, Slave Out
- nCSx: Chip select(s)
- device specific signals

Generally no interoperability above bit-level
- word lengths vary.
- endianness varies
- no discovery or enumeration mechanism

- MicroWire: half duplex
- Synchronous serial protocol SSP
- Programmable serial protocol PSP
- three wire SPI with single data signal MOMI/SISO

SPI slave chips
- DataFlash <> driver name
- probed based on spi_board_info associated with spi_master

spi_board_info - declares a slave chip present on the board
- driver name
- speed
- bus number, to match spi_master bus number
- chip select
- controller data, spi_master driver specific extra data

- spi_register_master()
- typically registered by a platform driver, probed for platform device
- transfer() - asynchronously queue job only - no sleeping
- needs to invoke complete callback - can be a context with no sleeping

- atomic series of spi_transfer segments
- chip select
- typically activated before first transfer, deactivated after last
- can flash inactive between individual xfers
- can leave cs active after the last xfer in a message

- read and/or write @ bits per word, speed, delay after xfer
- native CPU order words in buffers
- words right justified, unused bits in MSB
- buffer length always specified in bytes

- allocates a message and a number of transfers, buggy?

- low level message transfer

- synchronous, sleeping message with transfers

spi_device.write() &
- single half duplex synchronous transfer

spi_device.w8r8 / w8r16, common synchronous simple read/write transfer sequences

bitbang spi

- implement bitbang.chipselect()
- devices with multi word fifo: implement bitbang.txrx_bufs()
- device with single word shift register: implement bitbang.txrx_word[4]()
- device with GPIO: implement setsck(), setmosi(), getmiso()
- device with a gpio layer implementation
- declare master gpios clk,miso,mosi using spi_gpio platform data
- declare chipselect gpios by shoehorning them into spi_board_info controller data.
- no need to implement chipselect()
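
To get a feel for the GPIO variant, here is a userspace sketch of the bit shifting for CPOL=0/CPHA=0, MSB first. It is not the kernel's spi-bitbang code: the pin accessors only borrow the setsck()/setmosi()/getmiso() names from the notes above, and they drive a simulated wire where MOSI is looped back to MISO.

```c
#include <stdint.h>

/* Simulated pins; in a real driver these would be GPIO accesses. */
struct demo_pins {
    int sck;
    int mosi; /* looped back to MISO in this simulation */
};

void setsck(struct demo_pins *p, int level)  { p->sck = level; }
void setmosi(struct demo_pins *p, int level) { p->mosi = level; }
int getmiso(const struct demo_pins *p)       { return p->mosi; }

/* Shift one word of 1..32 bits, MSB first, mode 0: present data while
 * the clock is low, sample on the rising (leading) edge. */
uint32_t bitbang_txrx_mode0(struct demo_pins *p, uint32_t word, unsigned bits)
{
    uint32_t in = 0;

    for (unsigned i = 0; i < bits; i++) {
        setmosi(p, (int)((word >> (bits - 1 - i)) & 1u));
        setsck(p, 1);                       /* leading edge: sample */
        in = (in << 1) | (uint32_t)getmiso(p);
        setsck(p, 0);                       /* trailing edge: back to idle */
    }
    return in;
}
```

With the loopback wire, whatever is shifted out comes straight back in, which makes the full duplex nature of the bus easy to see.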

Friday, August 20, 2010

Filter selection through pipe in Emacs

Emacs has this wonderful feature of filtering a selection through a unix pipe. It's just a bit hard to remember how to invoke.

C-u M-| command RET

Thursday, August 19, 2010

RPM cheat sheet

list installed packages, dpkg -l
rpm -qa

info on an installed package, dpkg -s
rpm -qi

contents of an installed package, dpkg -L
rpm -q -l

package where a file belongs to, dpkg -S
rpm -q -f

extract package contents without installing, dpkg -x
rpm2cpio file.rpm | cpio -idv

Sunday, August 8, 2010

debugging makefile variables

To debug makefile variable assignments, use make -p (--print-data-base) to see which values get assigned to variables and where.

Need to add to reading list: Managing projects with GNU make.

Wednesday, August 4, 2010

OpenSSH host key signature fingerprint

Note to self. It's ssh-keygen that can dump the host key fingerprint in a format suitable for PuTTY security alert comparisons.

ssh-keygen -l -f /etc/ssh/

Thursday, June 24, 2010

Linux i2c

My notes while reading Linux kernel i2c source code and the documentation.

  • implements the character devices for userspace, a generic, "remote controlled" i2c chip driver
  • ioctl to select slave
  • read()/write() for a subset of smbus operations
  • i2c-tools with library functions
i2c adapter

A host bus adapter instance
  • class - allow probing for slave devices in class: HWMON, TV, DDC, SPD
  • nr - 'id' number of instance
  • algo - operations to implement

exported API
  • i2c_transfer() - sequence of low level i2c operations
  • i2c_smbus_xfer() - sequence of high level smbus operations

i2c algorithm

operations to implement a type of host bus adapter: bitbang, philips bus controllers
  • master_xfer() - implement low level i2c transaction segment: operations to read or write bytes to/from i2c device
  • smbus_xfer() - implement high level SMbus access
  • functionality() - query available functionality: SMBUS support, 10bit addressing
master_xfer can be omitted for SMbus only access.
smbus_xfer can be omitted for generic SMbus implementation using low level master_xfer

i2c (chip) drivers

Driver for a type of i2c slave device on an i2c bus.
  • io expanders, power supply supervisors, light sensor, clock
driver operations
  • probe() - gets a client handle and id of device which is supposed to be present.
  • remove()
  • detect() - autodetect which type of device is present: fill in the name in board info.
  • ioctl
  • suspend/resume/shutdown

communicates with the slave device through i2c client API. Prefer using SMbus operations, wider support in bus adapters

i2c client

an instance of a slave device
  • connected to a certain adapter (bus)
  • at a certain address (?)
  • controlled by a driver
exported API
  • i2c_master_[send|recv]() - individual low level i2c byte level read/write
  • i2c_smbus_[read|write]_byte_data() - individual high level smbus read/write

i2c board info

declaration of a slave device on a bus. i2c slave devices are not enumerated on the bus, layout must be known.

Alternative methods of setting up i2c slave devices

statically at arch_initcall time for platform
  • build an array to declare types of chips and their addresses
  • i2c_register_board_info()
dynamically at parent device init()
  • i2c_new_device()
  • i2c_new_probed_device() - probe a list of slave addresses if there's a device present
  • i2c_unregister_device()
run time detection from a haystack
  • implement detect() - by reading registers from the device to see if it's really there
  • automatically attempted when bus and driver classes match
  • should be avoided, slow and unsafe


a subset of i2c. "pseudocode" signatures for smbus operations:

void quick_command(u1 cmd)

u8 read_byte()
void write_byte(u8 data)

u8 read_byte_data(u8 addr)
void write_byte_data(u8 addr, u8 data)

u16 read_word_data(u8 addr)
void write_word_data(u8 addr, u16 data)

u8[n] read_block_data(u8 addr, u8 length)
void write_block_data(u8 addr, u8 length, u8 data[n])

smbus i2c extensions with no explicit count byte

u8[n] read_i2c_block_data(u8 addr, int n)
void write_i2c_block_data(u8 addr, u8 data[n], int n)

not implemented

u16 process_call(u8 addr, u16 input)
u8[n] block_process_call(u8 addr, u8 length, u8 data[n])
void host_notify(u8 addr, u16 data)

two comm byte block ops

u8[n] read_block_data(u16 addr, u8 length)
void write_block_data(u16 addr, u8 length, u8 data[n])

Existing slave chip drivers
Scattered around the drivers tree, mostly outside the i2c subdirectory. Search cross reference for i2c_client usage.
  • gpio extender
  • eeprom
  • hardware monitoring chips
  • video related devices
  • real time clocks
  • led dimmer
  • keypad
  • battery power
  • audio codec
  • miscellaneous

Thursday, June 17, 2010

LDD3 notes: Networking

My notes while reading Linux Device Drivers 3rd edition, network drivers.

General and setup

Network devices have no /dev entry point
  • different namespace
  • file operations don't make sense on network interface. Why? I think they could.

alloc_netdev variants alloc_[eth|fc|fddi|tr]dev
  • separate [ltalk|fc|fddi|hippi|tr]_setup
private data not a pointer to driver allocated data, but allocated along with the net_device.
  • supply size to alloc_netdev()
  • use netdev_priv(dev) to access the data
interface flags IFF_*
  • flags & IFF_DEBUG, enable debugging via ioctl: netifdebug
  • IFF_UP change -> open()/stop()
  • any flag change -> set_multicast_list()
Features, interesting ones

NETIF_F_NO_CSUM - interface needs no checksums
NETIF_F_HW_CSUM - interface hardware does checksums
  • by default all socket buffers are in low memory
Networking device structure
  • jiffy timestamps for last tx/rx, tx watchdog timeout.


functions to control transmission from the networking system
  • netif_start_queue - at open()
  • netif_stop_queue - at stop() or should hard_start_xmit see insufficient buffers left
  • netif_wake_queue() - at tx_completion: same as start, but kicks the networking system back to work
  • netif_tx_disable() - similar to stop, for outside hard_start_xmit

Book suggests that hard_start_xmit() should free the skb at the end. With real (DMA capable) hardware, it's probably better to free it at tx completion to avoid copying.
Book demonstrates extending short packets on the stack - looks like a bad idea for real hardware.


Example deals with skb's allocated at atomic context for received packets. For real hardware it's probably easier and more efficient to preallocate the skb's and run dma right into them.

  • determine protocol, eth_type_trans()
  • mark ip_summed HW/NONE
  • update stats
  • netif_rx()

Temporary polling mode for high throughput. Bypass interrupt overheads.

  • hardware packet buffering
  • capability to disable only rx interrupt
at rx interrupt
  • disable further rx interrupts
  • netif_rx_schedule()
at poll()
  • loop receiving packets
  • don't exceed CPU packet budget, device packet quota
  • netif_receive_skb() instead of netif_rx()
  • netif_rx_complete() and return 0 if no more packets left
  • return 1 if there were packets left
Bypasses input_pkt_queue?
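
The poll() contract above is easier to remember as code. This is a userspace model only: struct demo_dev and demo_poll() are hypothetical, the packet queue is just a counter, and the comments mark where the real driver would call netif_receive_skb() and netif_rx_complete().

```c
/* Simulated device state. */
struct demo_dev {
    int pending;    /* packets buffered by the "hardware" */
    int rx_irq_on;  /* 1 while the rx interrupt is enabled */
    int quota;      /* per-device packet quota */
};

/* Consume at most min(*budget, quota) packets. Return 0 when the device
 * is drained (and re-enable the rx interrupt), 1 when packets are left. */
int demo_poll(struct demo_dev *dev, int *budget)
{
    int limit = *budget < dev->quota ? *budget : dev->quota;
    int done = 0;

    while (done < limit && dev->pending > 0) {
        dev->pending--;       /* stands in for netif_receive_skb() */
        done++;
    }

    *budget -= done;
    dev->quota -= done;

    if (dev->pending == 0) {
        dev->rx_irq_on = 1;   /* stands in for netif_rx_complete() */
        return 0;
    }
    return 1;                 /* stay on the poll list */
}
```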

Link state
  • netif_carrier_[on|off|ok]()

  • networking layer by default reserves headroom it needs, at least 32 octets
  • drivers should reserve headroom to have IP header on aligned address (NET_IP_ALIGN)

head          data                    tail          end
 |             |                       |             |
 |  headroom   |      packet data      |  tailroom   |

pull:     data ----->
put:                                   tail ---->
reserve:  data ----->        and       tail ---->
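
The pointer juggling can be modelled in a few lines. This is a toy with offsets instead of real pointers and no bounds checking; the demo_skb names are hypothetical and only mimic what reserve, put and pull do to data and tail.

```c
#include <stddef.h>

/* Toy model of the four sk_buff pointers as offsets into one buffer. */
struct demo_skb {
    size_t head, data, tail, end;
};

void demo_skb_init(struct demo_skb *skb, size_t size)
{
    skb->head = skb->data = skb->tail = 0;
    skb->end = size;
}

/* reserve: on an empty buffer, move data and tail forward together. */
void demo_skb_reserve(struct demo_skb *skb, size_t len)
{
    skb->data += len;
    skb->tail += len;
}

/* put: append len bytes at the tail, return the offset of the new area. */
size_t demo_skb_put(struct demo_skb *skb, size_t len)
{
    size_t old_tail = skb->tail;

    skb->tail += len;
    return old_tail;
}

/* pull: strip len bytes (for example a header) from the front. */
void demo_skb_pull(struct demo_skb *skb, size_t len)
{
    skb->data += len;
}

size_t demo_skb_headroom(const struct demo_skb *skb) { return skb->data - skb->head; }
size_t demo_skb_tailroom(const struct demo_skb *skb) { return skb->end - skb->tail; }
size_t demo_skb_len(const struct demo_skb *skb)      { return skb->tail - skb->data; }
```

Reserving 2 bytes before putting a 14 byte ethernet header is the classic trick that lands the IP header on a 16 byte boundary.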

Hardware address resolution

  • details of the physical layer header are to be encapsulated in the driver
  • ethernet-specific header has a common implementation via ether_setup()
  • neighbour -mechanism used to implement arp, not described in the book?

  • ioctl on a socket invokes protocol specific ioctl()
  • protocol delegates unknown ioctls to device, based on ifreq.ifrn_name
  • do_ioctl() with ifreq in kernel-space, cmd

MII support
book describes an obsolete interface. write an overview of the current code?

  • not to be confused with the NAPI polling interface
  • bootloader-like interruptless operation adapted through polling into software simulated interrupts

Net namespaces
not described in the book, ran across them while reading vlan code
per namespace 'global' variables
[un]register_pernet_gen_device() to have a 'global variable' pointer identified by 'id'
net_assign_generic() to set pointer
net_generic() to get pointer

Saturday, June 12, 2010

Free UML tools

Excellent summary of freely available UML tools, with screenshots.

But it's missing the one I'm using the most:

Thursday, June 10, 2010

LDD3 notes: TTY

My notes while reading Linux Device Drivers 3rd edition, TTY drivers.

TTY drivers are a generic implementation of a serial port interface with loose coupling of hardware access and data formatting.

It is a char device composed of
  • tty core for user space char device interface
  • line discipline for data formatting
  • tty driver for hardware access

  • can be called from interrupt context.
  • must succeed for single byte write
write buffer treatment
  • put_char, flush_chars - start sending them, return right away
  • wait_until_sent - start sending, wait until done or timeout
  • flush_buffer - discard write buffered chars.


tty_insert_flip_char() for each char

  • whenever count reaches size
  • at the end of sequence

  • 38400,8n1 -style serial settings
  • handshake mechanism selection

  • access control lines
  • tty driver first, except break
  • fallback to inherit core functionality with -ENOIOCTLCMD
  • Get Line Status Register LSR: data, overrun, parity, framing, break...
  • Wait on MSR: change in CTS,DSR, ring, carrier..

  • rx flow control

  • tx enable/disable
write_wait queue - tty driver wakes up

tty open/close count example? what happens to tty filps on fork?

Tuesday, June 8, 2010

iMovie event library on a network drive, NAS

I was shocked to find that iMovie '09 would not allow storing event libraries on networked disks. The AFP volume would show up, but remain non-functional with a yellow exclamation sign.

There are plenty of instructions for hacking around the limitation with clumsy symbolic links. I discovered a much simpler approach, using an undocumented (?) setting built right into iMovie: "Allow Network Volumes".

Use at your own risk, try the following command in Terminal

defaults write -app iMovie allowNV -bool true

Mac Time Machine with Linux server

Finally, an easy way to have Time Machine accept a Linux server as a backup disk.

  • Upgrade to Ubuntu 10.04.
  • Install netatalk
  • Add option "tm" to a share in AppleVolumes.default config file.
No need to mess with
  • Manually created sparsebundles
  • Shoehorn settings like TMShowUnsupportedNetworkVolumes
The key is to have netatalk version 2.0.5 or above.

Thursday, June 3, 2010

LDD3 notes: DMA

My notes while reading Linux Device Drivers 3rd edition, Memory mapping and DMA.

Kernel addressing
  • Physical address
  • Logical Address
  • Virtual address
  • Bus address (physical address as understood by the DMA peripheral)
__pa() , __va() for low memory only


PFN - page frame number. (virtual or physical) address / page size

struct page
  • 'handle' to a page of memory.
  • page_address() macro to get virtual address, if mapped
  • kmap() kernel virtual address for any page in the system
  • may sleep, but has an atomic version

VMA - virtual memory area. a contiguous, homogeneous region of process virtual address space.
  • visible in /proc/pid/maps
  • nopage() - process accessed a missing page, go get it
  • vm_pgoff, offset in pages into the backing file
mm_struct - a list of virtual memory areas for a single process. threads may share one.

implementing mmap
  • remap_pfn_range(), once and for all ( io_remap_page_range() )
  • - reserved pages only: locked in memory, outside of memory management
  • implement vm_ops->nopage() and map page at a time to fill the holes
  • - can map regular ram pages
generic dma layer
  • encapsulates cache coherency, bus addressing and implements bounce buffering as a fallback
  • device.dma_set_mask() - how many address bits the device sees
  • cache (in)coherency: cpu won't see dma ram writes, dma ram reads won't see cpu writes
  • buffer allocation and bus addressing: dma_addr_t
  • "one shot" operation streaming mappings - preferred
  • dma_[un]map_single()
  • long standing coherent mappings
  • dma_{alloc|free}_coherent()
  • dma_pool_{create|alloc|free|destroy}()

scullp_vma_open, mapping count non-atomic?
vsyscall page?

Thursday, May 27, 2010

LDD3 notes: Driver Model

My notes while reading Linux Device Drivers 3rd edition.

2.6 device model addresses aspects common to all types of devices:
  • Power management
  • Userspace communications /sysfs
  • Hotplugging w/ udev
  • Device classes
  • Object lifecycle

Kobject, Kset, Subsystem
  • kobject per object, belongs to a ktype.
  • kset for a collection of kobjects of same type
  • Subsystem as a high-level portion of the kernel as a whole: block, devices, usb, pci
  • Class as high-level view of what a device does
  • embed struct kobject in your structure
  • initialize name, ktype, kset and parent
  • implement a destructor (release)
  • implement reference counting by wrapping kobject_{get|put}
  • back-cast kobject references to your structure using container_of()
Adding a Kobject to a Kset
  • point kobject.kset to set
  • point kobject.parent to the set
  • kobject_register()
  • a kset with a semaphore
  • rooted at the top of the sysfs hierarchy
  • subsystem_{init|register|unregister|get|put}

Bus, Device, Driver, Class, Class device
  • Bus as a channel between processor and devices: physical or virtual
  • Device as an instance to be controlled by a driver, connected to a bus
  • Driver
Bus type
  • bus_[un]register() - confusing name
  • match - compare whether a given device can be handled by the given driver
  • typically implement bus type specific functions to register devices to a specific bus instance
  • on bus enumeration
  • on platform initialization
Bus instance
  • each instance is a device
  • bus_id naming the bus instance
  • parent and/or bus could be null for a singleton top level bus, parent could be the host controller
  • parent pointers reflect a tree of device attachments through buses and host controllers
  • bus_id for unique identification within a bus
  • type of bus, for matching devices and drivers
  • driver and driver private data
Class & Class device
  • Typically handled by high-level code when you register some facilities you implement
  • Device membership in a class is represented by a struct class_device


Kobject Attributes, low-level interface
  • struct attribute: owner module and filesystem mode.
  • sysfs_ops: show/store
  • sysfs_create[_bin]_[file|link]
Attributes for high-level objects
  • [bus|driver|device]_create_file
  • BUS_ATTR(name,mode,show,store)


  • request_firmware(name)
  • release_firmware()
Userspace reacts to hotplug and special sysfs files
  • loading state
  • binary data
  • device identification
Probably implemented in udev?

Thursday, May 20, 2010

LDD3 notes: USB

My notes while reading Linux Device Drivers 3rd edition.


  • naming: root_hub-hub_port[-hub_port]:config.interface
  • write to bConfigurationValue will switch configuration & bound driver


  • skbuff analogy
  • queued, async, acknowledged
  • usb_alloc_urb(), usb_fill_[int|bulk]_urb(), usb_submit_urb()
  • transfer buffers must be kmalloc():ed, not static or automatic

device_id's can match

  • specific vendor/product
  • whole device class / subclass /protocol
  • individual interface class

MODULE_DEVICE_TABLE(usb, usb_device_id table)

Wednesday, May 19, 2010

vmware player network configuration

Vmware player 3.0.1 is a surprisingly capable and free virtualization environment. "Player" in the name is a bit misleading as many of the restrictions in earlier versions have been lifted.

However, the network configuration tool is not installed by default and needs to be manually extracted.

Richard at f5 devcentral provides the details:

  • You can install it manually by extracting it from the install file.
  • VMware-player-3.0.1-227600.exe /e .\vmwareextract
  • Open and copy vmnetcfg.exe to the root of your player install. You now have a nice graphical editor for your virtual network.

Thursday, May 13, 2010

LDD3 notes: Data types

My notes while reading Linux Device Drivers 3rd edition.

size_t vs ssize_t for negative values as error codes

C99 [u]intptr_t, integer which would fit a pointer. Not used in Linux. Similar to eCos cyg_addrword.

fixed size integers

  • u8..s32 within kernel
  • ioctl structures shared with userspace use prefix __u8

_t Typedefs

  • for portability only
  • out of fashion
  • inconvenient printk use

Important constants: HZ. PAGE_SIZE/SHIFT


  • cpu_to_le32() .. be64_to_cpu()


  • force struct to match hardware expectations, without padding for performance: __attribute__ ((packed))
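
A userspace sketch of both points, assuming GCC or Clang for the packed attribute. The struct layout and the demo_put_le32() helper are invented for illustration; in the kernel the byte order part would be cpu_to_le32().

```c
#include <stdint.h>

/* Hypothetical register block: a status byte followed by a 32-bit value. */
struct demo_padded {
    uint8_t status;
    uint32_t value;     /* compiler normally inserts padding before this */
};

struct demo_packed {
    uint8_t status;
    uint32_t value;     /* no padding: matches a 5 byte hardware layout */
} __attribute__((packed));

/* Store a 32-bit value in little endian byte order regardless of the
 * byte order of the CPU running the code. */
void demo_put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xffu);
    dst[1] = (uint8_t)((v >> 8) & 0xffu);
    dst[2] = (uint8_t)((v >> 16) & 0xffu);
    dst[3] = (uint8_t)((v >> 24) & 0xffu);
}
```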

Error codes in pointers

  • ERR_PTR() -> IS_ERR() -> PTR_ERR()
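
The trick is that the top 4095 addresses are never valid pointers, so a negative errno fits in there. Below is a userspace re-creation of the idea, not the kernel header itself; the demo_ names and the lookup function are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095 /* same bound the kernel uses */

/* Encode a negative errno value in a pointer. */
void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode it again. */
long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer falls in the last DEMO_MAX_ERRNO addresses. */
int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical lookup returning either a valid pointer or an error. */
const char *demo_lookup(int id)
{
    static const char name[] = "eth0";

    if (id != 0)
        return demo_err_ptr(-ENODEV);
    return name;
}
```

The caller checks demo_is_err() first and only then either uses the pointer or extracts the error code with demo_ptr_err().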


  • list_del_init() - re-init for future re-insertion
  • list_for_each_entry() - foreach and list_entry combined

Thursday, May 6, 2010

LDD3 notes: Memory

My notes while reading Linux Device Drivers 3rd edition.

Allocation Flags

- GFP_KERNEL - system call, i.e. on behalf of a process running in kernel -> may sleep

- GFP_ATOMIC - won't sleep, but has limited memory available

Slab caches for quick allocation of constant size objects ( == object pool )

- a.k.a lookaside cache

- ctor/dtor can be called in atomic context, indicated by a flag


- a wrapper for slab cache which keeps some free memory ready

- try to avoid


  • contiguous region of virtual memory
  • non-hardware related work only, like loading modules
  • arbitrary size
  • GFP_KERNEL internally, cannot use in atomic context
  • limited address space available, avoid.


  • arbitrary size
  • contiguous region of physical memory
  • with 1:1 mapped virtual addresses

get_free_page[s]() & get_order()

  • power of 2 number of pages
  • more than 1 is failure prone
  • with 1:1 mapped virtual addresses
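
A sketch of what the order means, as a userspace re-implementation of the idea rather than the kernel's own get_order(). The 4 KiB page size is an assumption, and the demo_ names are made up.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Smallest order such that (1 << order) pages can hold size bytes. */
int demo_get_order(size_t size)
{
    size_t page = (size_t)1 << DEMO_PAGE_SHIFT;
    size_t pages = (size + page - 1) >> DEMO_PAGE_SHIFT; /* round up */
    int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

Note how a request for 5 pages ends up as order 3, i.e. 8 pages: the power of 2 rounding can waste quite a bit.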


  • indirect access through struct page


  • good for counters: percpu_counter.h
  • how to iterate over all cpus?


  • i/o ports in separate i/o address space
  • inb() ... outsl()
  • or adapt them for memory mapped i/o using ioport_map()

request_mem_region() & ioremap()

  • memory mapped i/o
  • access through ioread8()/iowrite32_rep() ( == legacy readb() )

Chapter 8 recommends obsolete readb()? :"Rather, you should always use readb and the other I/O functions introduced in Chapter 9."

Tuesday, May 4, 2010

Insert a picture by reference in MS Word

I've always disliked manual, repetitive tasks with MS Word. There is one less to worry about, now that I've found out about inserting and linking pictures.

In insert picture file dialog, select "Insert and Link", to
  • make a link referencing the original file to keep it automatically updated when the picture changes.
  • have an embedded copy in the document to avoid broken links when sending out the Word document.
How to Insert Images in a Word Document without Embedding

Thursday, April 29, 2010

LDD3 notes: Passage of time

My notes while reading Linux Device Drivers 3rd edition


  • get_jiffies_64()
  • comparison macros: time_after/before
  • jiffies <-> timeval / timespec
  • get_cycles()
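
Why the comparison macros matter: the counter wraps, so a plain "<" gives the wrong answer around the wrap point. A userspace sketch of the same subtraction trick time_after() relies on; demo_time_after() is my own name, and the cast assumes the usual two's complement behaviour.

```c
#include <limits.h>

/* True if a is later than b, even when the counter has wrapped in
 * between, as long as the two are less than half the range apart. */
int demo_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```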

short busy waits

  • n/u/mdelay()

1ms/1s resolution sleeps

  • msleep(), ssleep()

Working with 1/HZ resolution sleeps

  • wait_event_timeout() - returns timeout left, never negative
  • set_current_state(interruptible), schedule_timeout()
  • in_atomic(), in_interrupt()

Sleeping with waitqueues

  • macro: one init_waitqueue_head, multiple wait_event(_interruptible), one wake_up to wake up them all.
  • manual wait_event alternative: prepare_to_wait(), schedule(), finish_wait(), signal_pending()
  • 'exclusive' sleepers are woken up as specified batches or individually, they won't behave like a herd
  • going to schedule soon, use wake_up_interruptible_sync, it won't reschedule right away
  • never use sleep_on, it's broken w/ race condition


  • atomic context in timer interrupt handler or softirq
  • multiple schedules result in single execution
  • cannot assume process context or access user space memory


  • queue has a kernel process, can sleep, but cannot access user space
  • each work_struct in queue only once at a time
  • can be delayed for jiffies

Monday, April 26, 2010


Reading a few LWN articles on RCUs (1 2 3) really shed some light on their properties and use.

On read-intensive scenarios, it's more efficient to replace read/write locking with an RCU. This is possible because all Linux platforms have atomic pointer read/write operations.

Read critical section
  • dereferences pointers through a mechanism with platform specific memory ordering guarantees: rcu_dereference()
  • may not sleep
  • may not keep or pass dereferenced pointers outside the critical section
Write critical section
  • Protects against concurrent writes using regular spinlock mutex
  • Makes a copy of the original structure, making updates in the copy
  • Swaps in the new version, while still keeping the old version (atomic)
  • Invokes synchronize_rcu() to wait for all readers to exit their current rcu read critical sections.
  • ( A non-pre-emptible kernel can simply wait for all CPUs to switch contexts. )
  • Free the old version
Properly used RCUs are immune to deadlocks, but synchronous writes may be very slow due to need to wait for the context switches.

Thursday, April 22, 2010

LDD3 notes: Concurrency

My notes while reading Linux Device Drivers 3rd edition.

Reasons to pay attention to concurrency

  • Early kernels had no SMP, no pre-emption -> enough to protect from interrupts
  • SMP and pre-emption both pose similar concurrency requirements, even if you'd be willing to ignore one of them.
  • shared resources -> avoid

semaphores: sema_init, up, down (declare_mutex)

  • read/write semaphore pairs: init_rwsem, [up|down]_[read|write], downgrade_write, trylock
  • semaphores are dangerous as automatic variables


  • init_completion, wait_for_completion (uninterruptible), complete, complete_all, complete_and_exit (thread)


  • higher performance than semaphores
  • disables pre-emption on current cpu
  • may not sleep while holding one
  • mutual exclusion with interrupts ok with spin_lock_irqsave, within one function

R/W spinlocks

Lockless data structures: kfifo generic circular buffer

atomic_t, a 24bit integer

atomic bit operations set_bit, clear_bit, change_bit, test_bit, test_and_set

seqlock: data structure versioning and retry on collision


SGI lockmeter to measure time spent waiting on locks

Thursday, April 15, 2010

LDD3 notes: Debugging

My notes while reading Linux Device Drivers 3rd edition.

printk_ratelimit() tells if we're not flooding the log

Kernel configuration

  • CONFIG_DEBUG_SLAB -> canary killed
  • CONFIG_DEBUG_SPINLOCK_SLEEP -> potential sleeps with spinlocks detected

seq_file - kernel in-memory file buffer, similar to open_memstream or ostringstream.

  • a cleaner interface for implementing a /proc file
  • iterator/visitor: start, show, next, stop
  • api for the visitor: seq_printf into a seq_file
  • fops implemented for reading

debugging a live kernel through a 'dynamic' core dump

  • gdb vmlinux /proc/kcore, core-file /proc/kcore
  • cannot modify data, breakpoint, watchpoint or single-step
  • add-symbol-file for modules, using /sys/module/*/sections/.* like with a jtag emulator
  • print *(address)

kdb -debugging from SGI

  • ia32 only
  • built-in to kernel, pause/break key as 'ctrl-c' takes you into debugger
  • has breakpoints and can modify data
  • sees module symbols automatically

kgdb -mm -variant

  • serial port, ethernet kgdboe
  • x86, ppc.
  • no ARM

kgdb variant

  • serial port only
  • x86, ppc.
  • no ARM

Linux trace toolkit


  • vs kprobes?

Thursday, April 8, 2010

LDD3 notes: Device registration and operations

My notes while reading Linux Device Drivers 3rd edition.

register_chrdev replaced with cdev_add(), cdev_del() and struct cdev

register_chrdev_region, alloc_chrdev_region, unregister_chrdev_region to

  • statically, dynamically pick a contiguous block of dev_t's i.e. major and minor numbers.
  • assign them a device name ( /proc/devices and sysfs )
  • no need to know major/minor numbers at open
  • look at inode->i_cdev,
  • deduce filp->private_data using container_of
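
The container_of step, sketched in userspace. The macro is a simplified version without the kernel's type checking, and struct demo_cdev / struct demo_device are hypothetical stand-ins for struct cdev and the driver's own per-device structure.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel macro. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_cdev {
    int dummy;                  /* stands in for struct cdev */
};

struct demo_device {            /* hypothetical driver state */
    int minor;
    struct demo_cdev cdev;      /* embedded, not a pointer */
};

/* What an open() handler would do with the cdev pointer it finds in the
 * inode: walk back to the enclosing device structure. */
struct demo_device *demo_from_cdev(struct demo_cdev *cdev)
{
    return demo_container_of(cdev, struct demo_device, cdev);
}
```

The result is what gets stashed in filp->private_data for the later read/write calls.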

Notes on proper behaviour with open/close

  • if the device cannot seek: nonseekable_open() , no_llseek
  • struct file represents an open file descriptor in kernel, can be shared by multiple processes
  • one open, fork&dup, single struct file, multiple close, one release

Notes on proper behaviour with read/write, Select(BSD)/Poll(SystemV)/Epoll(Linux)

  • O_NONBLOCK ( == O_NDELAY) and no progress possible -> immediate -EAGAIN
  • _interruptible fails -> -ERESTARTSYS, VFS will retry or return -EINTR
  • poll reports device writable -> next write must not block
  • encountering error in the middle of successful transfer: return partial result, next attempt will return failure
  • security implications of blindly referencing a user pointer


  • asm/ioctl.h + Documentation/ioctl-number.txt
  • "clueless" legacy: 8bit magic + 8bit device specific
  • capable() - permissions
  • access_ok() - plausible user memory address

Asynchronous notifications from user space perspective, on sockets and tty's

  • whom to notify: F_SETOWN
  • please notify: F_SETFL(FASYNC)
  • receive SIGIO -> select()

Thursday, April 1, 2010

LDD3 notes: Compiling and loading modules

My notes while reading Linux Device Drivers 3rd edition.

out of tree module makefile boilerplate with dual purpose

  • standalone: invokes kernel tree modules target
  • referenced by M: acts like in-tree kbuild makefile

disposable sections with __init, __initdata, __exit

module loading races

  • register facilities only when really ready to take calls
  • at failure, previously registered facilities can be in use already


  • automatically exposed in /sys/module
  • with given access permissions, can be read/written to
  • module won't be notified of writes

Thursday, March 25, 2010

Shared libraries and ELF

I just finished reading How to write Shared Libraries by Ulrich Drepper, a very nice guide to understanding the details of dynamic linking and use of ELF format with Linux.
  • ELF structure like program header and segments
  • Clear explanation of GOT and PLT with concrete examples
  • Types of relocations: relative, symbol relocation and evils of TEXTREL
  • Tracing dynamic linking w/ LD_DEBUG
  • concept of gnu-hash style and backwards compatibility with --hash-style=both
  • attributes constructor, destructor
  • common variables and -fno-common
  • controlling exported symbols with static / -fvisibility / attribute / #pragma visibility / export maps
  • symbol versioning: implementing backwards compatibility with a single dso: export maps and symver foo@VERS1 for past versions / symver foo@@VERS2 for the latest
  • dangers of depending on undefined symbols and -Wl,-z,defs
  • deprecated rpath vs runpath
  • relative paths to dso:s using dynamic string tokens like $ORIGIN
  • --as-needed to prune the list of DSO in DT_NEEDED

Wednesday, March 24, 2010

gnu binutils

I decided to read through the man pages of binutils, to see if there's something interesting I've never come across. The only things that were not previously very familiar were the c++filt and addr2line tools. Anyway, here are the notes I made while reading about each of the tools in the binutils package.

ar and ranlib
manage archives: static libraries of compiled object files. Most often these are used indirectly through libtool and automake.

Lists symbols from object files.
This is useful when troubleshooting linking problems. With nm you can check which data and functions are actually found in each intermediate object or library.

Copies and converts object files.
Typical use of objcopy is to convert an ELF file into a binary file. This is effectively pre-computing the work of an ELF loader for targets where an ELF loader is not relevant at run time, like ROM code.
Objcopy can also work the other way. Objcopy can wrap a binary file in an ELF object, so that binary data can be embedded into an application during linking. However, this is often done by dumping the binary as a C source (xxd -i) and then compiling it into an object, which is often easier and more portable.
Objcopy can also transform objects within a single format. It can strip sections or transform symbol names (--prefix-symbols=string).

Displays the contents of an object file.
Objdump (-d) can disassemble an object, to take a look what the compiler actually produced.
Objdump (-t) lists the symbols and their sizes in an ELF file. Sort this list and see where to focus your size optimization efforts first.

displays the text, data, bss segment sizes of an object file. A good sanity check if working with limited memory target.

Displays anything that resembles text. Good for first look at foreign binaries.

Slims down an object file by removing all that is not necessary for loading and relocating the binary. Symbols and debugging information are removed by default. Manually it is possible to remove any specific sections.

c++filt
Interprets C++ mangled symbols. Individual command line arguments can be demangled, or c++filt can act as a filter, demangling symbols found in the middle of a text stream passed from stdin to stdout.

addr2line
Traces addresses in a binary back to source files, functions and line numbers. It can optionally demangle symbols like c++filt.

readelf
Interprets the structure of an ELF binary file.
-h shows the file header: which machine the file is built for
-l shows the segments/program headers: what to load into memory
-S shows the sections
-s shows the symbols

I decided to skip the windows-specific tools. Instead, I should some day match these up with Mach-O specific tools.

Monday, March 22, 2010

Dangers of C library

I just read through Ulrich Drepper's Defensive Programming article. He highlights the dangers of the C library, provides safer alternatives for commonly used idioms and introduces a number of tools for pinpointing problems in code.

These are my rather unstructured notes of things most interesting to me.

Syntax for forward declarations within a parameter list, and array size annotations for future checks (a GNU extension)
int addvec(int n; int arr[n], int n)

tools for formatting an arbitrarily long string
asprintf() which allocates the buffer, for constant format string only
FILE *open_memstream(), for an ostringstream -like in-memory file
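A minimal sketch of building a string of unknown length with open_memstream(), assuming a POSIX.1-2008 C library. The join_ints helper is a hypothetical example of mine, not something from the article.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: join n integers with commas into a freshly
   allocated string. Returns NULL on failure; the caller frees it. */
char *join_ints(const int *v, size_t n)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *f = open_memstream(&buf, &len);  /* stream grows buf as needed */
    if (f == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        fprintf(f, i ? ",%d" : "%d", v[i]);
    if (fclose(f) != 0) {                  /* fclose finalizes buf and len */
        free(buf);
        return NULL;
    }
    return buf;
}
```

The point is that no buffer size has to be guessed up front; buf and len are only valid after fflush() or fclose().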

fast string handling within stack
strdupa() and alloca()

parsing strings of arbitrary size
%as format string with buffer allocated by scanf

parsing simple delimited files with arbitrary length of lines
ssize_t getdelim()
ssize_t getline()
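A minimal sketch of reading lines of arbitrary length with getline(), assuming a POSIX.1-2008 C library. The longest_line helper is hypothetical and only there to show the allocation pattern.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: length of the longest line in a stream,
   not counting the trailing newline. */
size_t longest_line(FILE *in)
{
    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;
    size_t best = 0;
    ssize_t n;

    while ((n = getline(&line, &cap, in)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            n--;
        if ((size_t)n > best)
            best = (size_t)n;
    }
    free(line);          /* one free for the whole loop */
    return best;
}
```

The same buffer is reused for every line, so there is a single free() at the end.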

practical examples of using file descriptor variants of regular functions for avoiding race conditions
eg. chown -> fchown

O_NOFOLLOW for avoiding dereferencing symbolic links
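A minimal sketch combining the two ideas above: open once with O_NOFOLLOW, then do all further checks on the descriptor (fstat instead of stat) so the path cannot be swapped in between. The regular_file_size helper is a hypothetical example, assuming a POSIX.1-2008 system.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: size of a regular file, or -1 if the path is
   missing, is a symbolic link, or is not a regular file. */
long regular_file_size(const char *path)
{
    /* O_NOFOLLOW makes open() fail if the last path component is a symlink. */
    int fd = open(path, O_RDONLY | O_NOFOLLOW);
    if (fd < 0)
        return -1;

    struct stat st;
    long size = -1;
    /* fstat() inspects the object we actually opened, not the name. */
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode))
        size = (long)st.st_size;

    close(fd);
    return size;
}
```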

mapping a function over a directory tree with nftw()

create temporary files only with mkstemp() or tmpfile()

unix domain socket authentication

proper randomness
  • initstate() from /dev/random
  • random()
_FORTIFY_SOURCE for compiling in runtime memory checks around standard C runtime.

use of __attribute__((__deprecated__)) and the concept of .gnu.warning sections within ELF.

good overview of tools

  • forces uninitialized memory to contain given garbage
  • output a trace file with malloc and free operations for single threaded programs
  • replaces malloc
electric fence
  • inserts protected pages around allocated memory, over freed memory
  • virtual machine for IA-32 only
  • gcc4 inserted instrumentation and library for memory checking
catchsegv / libsegfault
  • dump backtrace on crash
_GLIBCXX_DEBUG - templates with assertions

Thursday, March 18, 2010

Booting Linux over JTAG

Note to self. A working gdb script to boot Linux kernel on an atmel board over JTAG.

Unfortunately this is not completely automatic. u-boot needs to be interrupted for loading the uImage through gdb. u-boot can then be resumed and the kernel booted with bootm 0x22000000.

Also, either the emulator needs to be configured to ignore data aborts, or kernel futex support needs to be disabled.

target remote emulator:2001

monitor reset halt

define bootstrap
file /path/to/at91sam9260ek/at91sam9260ek-nandflashboot-2.11-r1.elf
j *0x200000 # bootstrap entry point in sram
end

define u-boot
file /path/to/u-boot/u-boot
j *0x23f00000 # u-boot entry point in sdram
end

define linux
restore /path/to/arch/arm/boot/uImage binary 0x22000000
file /path/to/vmlinux
end

Sunday, March 14, 2010

Optimizing with gcc

My notes while reading Optimizing applications with gcc & glibc by Ulrich Drepper.
  • __extension__ to mark intentional use of GNU extension.
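  • poor man's 'lambda': statement expressions ({ })
  • predicate to determine if a parameter value is known to be constant at compile time: __builtin_constant_p()
  • mark functions whose result depends only on their arguments with __attribute__((__const__)); this is stricter than __pure__, which may also read global memory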
  • mark functions which never return with __attribute__((__noreturn__))
  • __stdcall__ attribute on x86 makes the callee clean up the stack at ret; in the Unix convention the caller cleans up the stack
  • mempcpy returns a pointer just after the last copied byte.
  • calloc knows if memory is already full of 0's -> faster than malloc & memset
  • int_least16_t, int_fast16_t when the exact variable size does not matter but speed does
  • replace strlen and + with strchr(s,'\0') or rawmemchr()
  • gcc computed gotos and jump tables: && for the address of a label
  • -pg and -profile : gprof and sprof
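A minimal sketch of two of the GNU C extensions from the list above: a statement expression marked with __extension__, and computed gotos through a table of label addresses. It assumes gcc or a compatible compiler. The MAX_OF macro, the opcodes and the run function are hypothetical examples of mine, not taken from the article.

```c
/* Statement expression: each argument is evaluated exactly once.
   __extension__ marks the GNU syntax as intentional for -pedantic. */
#define MAX_OF(a, b) (__extension__ ({ \
    __typeof__(a) a_ = (a);            \
    __typeof__(b) b_ = (b);            \
    a_ > b_ ? a_ : b_; }))

enum { OP_INC, OP_DOUBLE, OP_HALT };   /* hypothetical opcodes */

/* Tiny interpreter that dispatches through a table of label
   addresses (&&label) instead of a switch statement. */
int run(const unsigned char *code, int acc)
{
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };

    goto *dispatch[*code++];
do_inc:
    acc += 1;
    goto *dispatch[*code++];
do_double:
    acc *= 2;
    goto *dispatch[*code++];
do_halt:
    return acc;
}
```

The program must end with OP_HALT and contain only valid opcodes; the sketch does no bounds checking on the table.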

Saturday, March 13, 2010

Eclipse CDT

I'll have to try out the Eclipse CDT sometime
- integration of native development tools such as the GNU Autotools: Linux Tools
- a plugin to explore remote system files through Eclipse: eclipse-rse

Wednesday, March 10, 2010

64bit PC's and memory

Just to refresh my knowledge on the PC platforms, I read the excellent Wikipedia article on the 64-bit x86.

PC processors and 64-bitness

The current 64-bit instruction set is a backwards compatible extension to the Intel instruction set. It is implemented in all typical processors from all common vendors. It has multiple confusing names, all meaning the same: Intel 64, amd64, EM64T, x86_64, x86-64, IA-32e.

Intel Itanium (IA-64) is a completely different thing, and has nothing to do with current 64bit systems.

Not all of the 64 bits are in use today. Some of them are cleverly saved for the future by dividing the virtual address space into two halves. By having the unused gap in the middle as a forbidden zone, the address bits reserved for future use are safe from short term abuse by implementations.
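A minimal sketch of the "two halves" rule, assuming 48 implemented virtual address bits as on the x86-64 processors available at the time of writing. An address is valid (canonical) only if bits 63..47 are all zeros or all ones. The names VA_BITS and is_canonical are mine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits. */
#define VA_BITS 48

/* True if addr lies in the lower or upper half, false if it falls
   into the forbidden gap in the middle. */
bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> (VA_BITS - 1);                         /* bits 63..47 */
    uint64_t all_ones = (UINT64_C(1) << (64 - VA_BITS + 1)) - 1;  /* 17 one bits */
    return top == 0 || top == all_ones;
}
```

Under this assumption the lower half ends at 0x00007FFFFFFFFFFF and the upper half starts at 0xFFFF800000000000.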

Maximum physical memory in a PC

Having more than 4GB of physical memory is not an issue for 64-bit systems. Even a single process can use more than 4GB.

Using more than 4GB of physical memory has been possible for 32-bit systems as well since Pentium Pro ('90s). Physical Address Extension alters the page table structure to support this. Without PAE, each page table has 1024 32-bit physical addresses. With PAE, a page table has 512 64-bit addresses, even though only 36 will be used. A single process still cannot use more than 4GB, but you could have multiple processes, having mapped up to 4GB physical RAM each.
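The numbers above can be checked with a little arithmetic, assuming the classic 4 KiB x86 page, which is also the size of one page table. The helper names are hypothetical.

```c
#include <stdint.h>

#define X86_PAGE_BYTES 4096u   /* a page table occupies one 4 KiB page */

/* Number of entries that fit in one page table. */
unsigned entries_per_table(unsigned entry_bytes)
{
    return X86_PAGE_BYTES / entry_bytes;
}

/* Bytes reachable with the given number of physical address bits. */
uint64_t addressable_bytes(unsigned address_bits)
{
    return UINT64_C(1) << address_bits;
}
```

With 4-byte entries a table holds 1024 entries, with 8-byte PAE entries 512. 32 address bits reach 4 GiB, and the 36 bits used with PAE reach 64 GiB.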

Monday, March 8, 2010

Linux Kernel in a nutshell

Although I've been working with Linux kernel for ages, I decided to browse through Linux Kernel in a nutshell, an excellent book on the practical things regarding working with the kernel. There certainly were things that I was not aware of. Here are my notes on these things.

Using a read-only source tree with O=/path/to/build/output

Some basic static analysis with

  • sparse with C=1 or C=2.
  • checkstack
  • namespacecheck

Localversion-prefixed files in object and source trees.

Finding all kernel modules required by the running system, using modaliases in the /sys tree and modprobe with --show-depends. Or helper script.

Kernel argument max_addr to force an upper bound on the physical addresses used. Probably good for validating a software image for a low cost board version. I've used mem=XX for this before, but I believe max_addr could yield even more realistic results.

The book included nice "demos" for tools like

  • quilt for managing a set of patches against an upstream source tree
  • ketchup as a frontend to upstream kernel archives.

These were not described in detail, but are interesting and I need to remember to read more about them later.

  • RCU's
  • format of modaliases
  • seccomp for sandboxing processes
  • kernel SPI support
  • inotify mechanism
  • oprofile tooling
  • kernel security models.

Saturday, January 23, 2010

USB logic analyzer from Saleae

Yesterday I received my Logic, an affordable USB-attached logic analyzer.
The software for Mac OS X is in early stages with very basic capture functionality only. But it already helped me troubleshoot one system.

I've used its Windows software in the past, which is much better. I hope features like the integrated protocol analyzers will soon be part of the Mac package as well.