豆瓣🔗:

1 Introduction to the Linux Kernel

2 Getting Started with the Kernel

1
2
3
4
5
# git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Cloning into 'linux-2.6'...
remote: Enumerating objects: 4084, done.
...
# cd linux-2.6

This command creates a configuration based on the defaults for your architecture:

1
# make defconfig

bug:

1
2
3
4
5
6
7
8
# make defconfig
make defconfig *** Default configuration is based on 'multi_v7_defconfig' 
ld: unknown option: --version 
ld: unknown linker 
scripts/Kconfig.include:56: Sorry, this linker is not supported. 
make[2]: *** [defconfig] Error 1 
make[1]: *** [defconfig] Error 2 
make: *** [__sub-make] Error 2

solution: Configuring the Linux Kernel from MacOS


1
# make

bug:

1
2
3
4
5
6
7
8
9
# make
  HOSTCC  scripts/sorttable
scripts/sorttable.c:27:10: fatal error: 'elf.h' file not found
#include <elf.h>
         ^~~~~~~
1 error generated.
make[2]: *** [scripts/sorttable] Error 1
make[1]: *** [scripts] Error 2
make: *** [__sub-make] Error 2

bug:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
# make
  SYNC    include/generated/autoconf.h
ld: unknown option: --version
ld: unknown linker
  CC      scripts/mod/empty.o
./scripts/check-local-export: line 17: shopt: lastpipe: invalid shell option name
make[2]: *** [scripts/mod/empty.o] Error 1
make[2]: *** Deleting file `scripts/mod/empty.o'
make[1]: *** [prepare0] Error 2
make: *** [__sub-make] Error 2

solution: macOS交叉编译Linux避坑指南

(不知道是哪位大佬的避坑指南,但的确macOS遇到的好几个问题都一样)


A Beast of a Different Nature

  • The kernel has access to neither the C library nor the standard C headers.

  • The kernel is coded in GNU C.

  • The kernel lacks the memory protection afforded to user-space.

  • The kernel cannot easily execute floating-point stack.

  • The kernel has a small per-process fixed-size stack.

  • Because the kernel has asynchronous interrupts, is preemptive, and supports SMP, synchronization and concurrency are major concerns within the kernel.

  • Portability is important.

No libc or Standard Headers

1
2
3
printk("Hello world! A string '%s' and an integer '%d'\n", str, i);

printk(KERN_ERR "this is an error!\n");

Inline Functions

upside:

  • eliminates the overhead of functions invocation and return (register saving and restore)
  • allows for potentially greater optimization as the compiler can optimize both the caller and the called function as one.

downside (nothing in life is free):

  • code size increases because the contents of the function are copied into all the callers, which increase memory consumption and instruction cache footprint (占用空间).

从 “nothing in life is free” 想起了 “There’s no such thing as a free lunch. 天下没有免费的午餐” 和

She was still too young to know that life never gives anything for nothing, and that a price is always exacted for what fate bestows.

她那时候还太年轻,不知道所有命运赠予的礼物,早已在暗中标好了价格。

——[茨威格] 《断头王后》

计算机中“空间换时间”和“时间换空间”不也是同样的道理。Every coin has two sides.


Kernel developers use inline functions for small time-critical functions.

Making large functions inline, especially those used more than once or that are not exceedingly time critical, is frowned upon.

1
2
//The function declaration must precede any usage.
static inline void wolf(unsigned long tail_size)

Inline Assembly

1
2
3
unsigned int low,high;
asm volatile("rdtsc": "=a"(low),"=b"(high));
/* low and high now contain the lower and upper 32-bits of the 64-bit tsc*/

Branch Annotation

1
2
3
4
5
6
7
8
9
if(error){
  /* ... */
}
if(unlikely(error)){
  /* ... */
}
if(likely(error)){
  /* ... */
}

Conclusion

The most important step on the road to Linux development is the realization that the kernel is not something to fear. Unfamiliar, sure. Insurmountable? Not at all.

很喜欢这句💪。

3 Process Management

Process Descriptor and the Task Structure

the task list: circular doubly linked list 循环双向链表

Each element in the task list is a process descriptor of the type struct task_struct.

1
2
3
4
5
6
7
8
9
struct task_struct{
    unsigned long state;
    int prio;
    unsigned long policy;
    struct task_struct *parent;
    struct list_head tasks;
    pid_t pid;
    ...
}

Allocating the Process Descriptor

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
struct thread_info{
    struct task_struct *task;
    struct exec_domain *exec_domain;
    _u32 status;
    _u32 status;
    _u32 cpu;
    int preempt_count;
    mm_segment_t addr_limit;
    struct restart_block restart_block;
    void *sysenter_return;
    int uaccess_err;
}

Process State

  • TASK_RUNNING
  • TASK_INTERRUPTIBLE
  • TASK_UNINTERRUPTIBLE
  • _TASK_TRACED
  • _TASK_STOPPED

Manipulating the Current Process State

1
2
3
4
5
set_task_state(task, state);
task->state=state;

set_current_state(state)
set_task_state(current, state)

The Process Family Tree

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
struct task_stuct *my_parent=current->parent;

struct task_struct *task;
struct list_head *list;
list_for_each(list, &current->children){
  task=list_entry(list, struct task_struct, sibling);
}

struct task_struct *task;
for(task=current; task!=&init_task; task=task->parent)
  
list_entry(task->tasks.next; struct task_struct, tasks)
list_entry(task->tasks.prev; struct task_struct, tasks)
  
struct task_struct *task;
for_each_process(task){
  printk("%s[%d]\n", task->comm, task->pid);
}

The Linux Implementation of Threads

Linux implements all threads as standard processes.

4 Process Scheduling

5 System Calls

6 Kernel Data Structures

Red-Black Trees

Linux’s primary binary tree data structure is the read-black tree.

Red-black trees remain semi-balanced by enforcing that the following six properties remain true:

  1. All nodes are either red or black.
  2. Leaf nodes are black.
  3. Leaf nodes do not contain data.
  4. All non-leaf nodes have two children.
  5. If a node is red, both of its children are black.
  6. The path from a node to one of its leaves contains the same number of black nodes as the shortest path to any of its other leaves.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
/*
 * red-black trees properties:  https://en.wikipedia.org/wiki/Rbtree
 *
 *  1) A node is either red or black
 *  2) The root is black
 *  3) All leaves (NULL) are black
 *  4) Both children of every red node are black
 *  5) Every simple path from root to leaves contains the same number
 *     of black nodes.
 *
 *  4 and 5 give the O(log n) guarantee, since 4 implies you cannot have two
 *  consecutive red nodes in a path and every red node is therefore followed by
 *  a black. So if B is the number of black nodes on every simple path (as per
 *  5), then the longest possible path due to 4 is 2B.
 *
 *  We shall indicate color with case, where black nodes are uppercase and red
 *  nodes will be lowercase. Unknown color nodes shall be drawn as red within
 *  parentheses and have some accompanying text comment.
 */

rbtrees

1
struct rb_root root=RB_ROOT;

a search of Linux’s page cache for a chunk of a file (represented by an inode and offset pair).

Each inode has its own rbtree, keyed off of page offsets into file.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
struct page * rb_search_page_cache(struct inode *inode, unsigned long offset)
{
    struct rb_node *n=inode->i_rb_page_cache.rb_node;
  
  	while(n){
      	struct page *page=rb_entyry(n, struct page, rb_page_cache);
      	if(offset < page->offset) n=n->rb_left;
      	else if(offset > page->offset) n=n->rb_right;
      	else return page;
    }
  	return NULL;
}

Insert:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
struct page * rb_insert_page_cache(struct inode *inode, unsigned long offset, struct rb_node *node)
{
    struct rb_node *p=&inode->i_rb_page_cache.rb_node;
    struct rb_node *parent=NULL;
    struct page *page;

    while(*p){
      parent=*p;
      page=rb_entry(parent, struct page, rb_page_cache);
      
      if(offset < page->offset) p=&(*p)->rb_left;
      else if(offset > page->offset) p=&(*p)->rb_right;
      else return page;
    }

    rb_link_node(node, parent, p);
    rb_insert_color(node, &inode->i_rb_page_cache);

    return NULL;
}

7 Interrupts and Interrupt Handlers

8 Bottom Halves and Deferring Work

9 An Introduction to Kernel Synchronization

10 Kernel Synchronization Methods

11 Timers and Time Management

12 Memory Management

13 The virtual File system

14 The Block I/O layer

15 The Process Address Space