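Legal notice: the article series "Linux 3.4.10 Kernel Memory Management Source Code Analysis" is published by Chen Jinfei (ancjf@163.com) at http://blog.csdn.net/ancjf under the GPL. Reposting is welcome; please credit the author and include this notice.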
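Memory handed out by the slab allocator always starts as a block that the slab layer requests from the buddy system, splits into small pieces, and then hands out piece by piece. A slab block is divided into equally sized objects. The start address of object zero is stored in the s_mem member of struct slab, and every object has an index that is unique within its slab block. Each block obtained from the buddy system is described by a struct slab, but the control data of a slab is not held entirely in struct slab: extra data is needed to track the free objects, and slab blocks in different slab caches may hold different numbers of objects. The free objects are managed as a singly linked list of indices: each object has an index, and each index has one entry in the index list that stores the index of the next free object. Allocation takes the entry at the head of the list, and freeing pushes the released entry onto the head. The free list is always stored immediately after the struct slab structure. The accompanying figure shows the control data of a slab block with 6 objects, 3 of them free: the first cell holds the struct slab structure, followed by the slab's free index list.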
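As a minimal sketch of the index-linked free list described above, the following user-space C fragment models a slab with 6 objects, 3 of them free. The names `toy_slab`, `TOY_END`, and the particular free indices (1, 3, 4) are assumptions made up for illustration; they are not taken from the figure or from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM 6            /* objects per slab (hypothetical) */
#define TOY_END 0xffffffffu  /* hypothetical end-of-list marker */

/* Toy stand-in for struct slab plus the index array that follows it. */
struct toy_slab {
    unsigned int free;           /* index of the first free object */
    unsigned int next[TOY_NUM];  /* next[i]: free index that follows i */
};

/* Count free objects by following the index chain from the head. */
static unsigned int toy_count_free(const struct toy_slab *s)
{
    unsigned int n = 0;

    for (unsigned int i = s->free; i != TOY_END; i = s->next[i])
        n++;
    return n;
}

/* Build the assumed state: objects 1, 3 and 4 free, chained 1 -> 3 -> 4. */
static struct toy_slab toy_example(void)
{
    struct toy_slab s = { .free = 1 };

    /* Entries of in-use objects are never read; set only for determinism. */
    for (unsigned int i = 0; i < TOY_NUM; i++)
        s.next[i] = TOY_END;
    s.next[1] = 3;
    s.next[3] = 4;
    s.next[4] = TOY_END;
    return s;
}
```

Storing indices rather than pointers keeps each list entry small and lets the whole list live in one array next to the descriptor.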
The function that returns the address of a slab block's free index list is slab_bufctl, implemented in mm/slab.c:
static inline kmem_bufctl_t *slab_bufctl(struct slab *slabp)
{
    return (kmem_bufctl_t *) (slabp + 1);
}
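The `slabp + 1` expression relies on pointer arithmetic: adding 1 to a `struct slab *` advances by `sizeof(struct slab)` bytes, which is exactly where the index array begins. A user-space sketch of the same layout follows; the reduced `toy_slab` struct, the `toy_bufctl_t` typedef, and the `toy_slab_alloc` helper are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int toy_bufctl_t;   /* assumed stand-in for kmem_bufctl_t */

struct toy_slab {                    /* hypothetical, reduced descriptor */
    void *s_mem;
    unsigned int inuse;
    toy_bufctl_t free;
};

/* Same idiom as slab_bufctl(): the array starts right after the struct. */
static toy_bufctl_t *toy_bufctl(struct toy_slab *slabp)
{
    return (toy_bufctl_t *)(slabp + 1);
}

/* Allocate the descriptor and an index array for num objects in one block. */
static struct toy_slab *toy_slab_alloc(unsigned int num)
{
    return calloc(1, sizeof(struct toy_slab) + num * sizeof(toy_bufctl_t));
}
```

Because the array length is not part of the struct type, the same descriptor type can serve caches with different object counts.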
index_to_obj converts an object index to the object's virtual address, and obj_to_index converts an object's virtual address back to its index. Both are implemented in mm/slab.c:
static inline void *index_to_obj(struct kmem_cache *cache, struct slab *slab,
                                 unsigned int idx)
{
    return slab->s_mem + cache->buffer_size * idx;
}

/*
 * We want to avoid an expensive divide : (offset / cache->buffer_size)
 *   Using the fact that buffer_size is a constant for a particular cache,
 *   we can replace (offset / cache->buffer_size) by
 *   reciprocal_divide(offset, cache->reciprocal_buffer_size)
 */
static inline unsigned int obj_to_index(const struct kmem_cache *cache,
                                        const struct slab *slab, void *obj)
{
    u32 offset = (obj - slab->s_mem);
    return reciprocal_divide(offset, cache->reciprocal_buffer_size);
}
Converting between index and address requires the object size, which a slab cache stores in the buffer_size member of struct kmem_cache. index_to_obj is straightforward, so only obj_to_index is discussed below.
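The two conversions are inverses of each other. The sketch below shows this in user space using an ordinary division for the address-to-index direction; the `toy_cache` and `toy_slab` structs are hypothetical reductions that keep only the fields the conversion needs.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cache { size_t buffer_size; };  /* object size in bytes */
struct toy_slab  { char *s_mem; };         /* address of object 0 */

/* index -> address: base plus index times object size. */
static void *toy_index_to_obj(const struct toy_cache *c,
                              const struct toy_slab *s, unsigned int idx)
{
    return s->s_mem + c->buffer_size * idx;
}

/* address -> index: byte offset from the base divided by object size. */
static unsigned int toy_obj_to_index(const struct toy_cache *c,
                                     const struct toy_slab *s,
                                     const void *obj)
{
    size_t offset = (size_t)((const char *)obj - s->s_mem);

    return (unsigned int)(offset / c->buffer_size);  /* plain division */
}
```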
reciprocal_buffer_size is computed in kmem_cache_init in mm/slab.c:
cache_cache.reciprocal_buffer_size =
    reciprocal_value(cache_cache.buffer_size);
The code of reciprocal_value is in lib/reciprocal_div.c:
u32 reciprocal_value(u32 k)
{
    u64 val = (1LL << 32) + (k - 1);
    do_div(val, k);
    return (u32)val;
}
The code of reciprocal_divide is in include/linux/reciprocal_div.h:
static inline u32 reciprocal_divide(u32 A, u32 R)
{
    return (u32)(((u64)A * R) >> 32);
}
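To experiment with these two helpers outside the kernel, a user-space port might look like the sketch below. The `toy_` names are hypothetical, `do_div` is replaced by an ordinary 64-bit division, and the kernel types are replaced by their `<stdint.h>` equivalents.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 / k rounded up, computed once per divisor. */
static uint32_t toy_reciprocal_value(uint32_t k)
{
    uint64_t val = (1ULL << 32) + (k - 1);

    return (uint32_t)(val / k);  /* stands in for do_div(val, k) */
}

/* (a * r) / 2^32: one multiply and one shift instead of a divide. */
static uint32_t toy_reciprocal_divide(uint32_t a, uint32_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

The reciprocal is computed once when the cache is set up, so the cost of the real division is paid a single time rather than on every free.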
Putting these together, obj_to_index computes (((2^32 + (buffer_size - 1)) / buffer_size) * offset) / 2^32, where every / is an integer division.
Write b for buffer_size and R for (2^32 + (b - 1)) / b. R is 2^32 / b rounded up, so R * b = 2^32 + e for some 0 <= e < b. Also write offset = q * b + r with 0 <= r < b, so that offset / b == q in integer arithmetic.
Upper bound: as real numbers, R * offset / 2^32 = offset / b + e * offset / (b * 2^32) = q + r / b + e * offset / (b * 2^32). The integer part of this value is still q as long as r * 2^32 + e * offset < b * 2^32. Since r <= b - 1 and e <= b - 1, this is guaranteed whenever (b - 1) * offset < 2^32, which is the condition the derivation relies on. Under that condition, (R * offset) / 2^32 <= offset / b.
Lower bound: offset / b <= (R * offset) / 2^32 holds
if offset * 2^32 <= R * offset * b,
that is, if 2^32 <= R * b. This is always true because R is 2^32 / b rounded up, so offset / b <= (R * offset) / 2^32 also holds.
Combining the two bounds gives offset / buffer_size == (((2^32 + (buffer_size - 1)) / buffer_size) * offset) / 2^32.
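The condition (buffer_size - 1) * offset < 2^32 is not just a formality. The sketch below compares the reciprocal method against plain division in user space; the helper name `toy_agrees` is hypothetical. With buffer_size = 3, the two agree for offset = 2^31 - 1 (where 2 * offset < 2^32) and disagree for offset = 2^31 (where 2 * offset == 2^32).

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the reciprocal method matches offset / b, 0 otherwise. */
static int toy_agrees(uint32_t offset, uint32_t b)
{
    /* Same arithmetic as reciprocal_value(): 2^32 / b rounded up. */
    uint32_t r = (uint32_t)(((1ULL << 32) + (b - 1)) / b);
    /* Same arithmetic as reciprocal_divide(): multiply, then shift. */
    uint32_t q = (uint32_t)(((uint64_t)offset * r) >> 32);

    return q == offset / b;
}
```

Offsets inside a slab block are far below this limit for realistic object sizes, so the shortcut is safe for the way obj_to_index uses it.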
As the comment above obj_to_index says, the result is offset / buffer_size. It is implemented this way only to avoid a division, because divide instructions are comparatively slow.
The function that allocates an object from a slab block is slab_get_obj, implemented in mm/slab.c:
2866 static void *slab_get_obj(struct kmem_cache *cachep, struct slab *slabp,
2867                           int nodeid)
2868 {
2869     void *objp = index_to_obj(cachep, slabp, slabp->free);
2870     kmem_bufctl_t next;
2871
2872     slabp->inuse++;
2873     next = slab_bufctl(slabp)[slabp->free];
2874 #if DEBUG
2875     slab_bufctl(slabp)[slabp->free] = BUFCTL_FREE;
2876     WARN_ON(slabp->nodeid != nodeid);
2877 #endif
2878     slabp->free = next;
2879
2880     return objp;
2881 }
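Line 2869 obtains a pointer to the object at the head of the free index list, line 2872 increments the count of allocated objects, line 2873 reads the next free index, and line 2878 stores that index as the new list head.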
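The allocation step amounts to popping the head of an index list. A minimal user-space sketch, ignoring the DEBUG branch and returning the index instead of an address, could look like this; `toy_slab`, `TOY_END`, and the helper names are hypothetical.

```c
#include <assert.h>

#define TOY_NUM 4
#define TOY_END 0xffffffffu   /* hypothetical end-of-list marker */

struct toy_slab {
    unsigned int inuse;            /* objects currently handed out */
    unsigned int free;             /* head of the free index list */
    unsigned int bufctl[TOY_NUM];  /* bufctl[i]: free index after i */
};

/* Pop the head of the free list; the caller must ensure the slab is not full. */
static unsigned int toy_get_index(struct toy_slab *s)
{
    unsigned int idx = s->free;    /* object that will be handed out */

    s->inuse++;
    s->free = s->bufctl[idx];      /* next free index becomes the head */
    return idx;
}

/* A fresh slab: all objects free, chained 0 -> 1 -> 2 -> 3. */
static struct toy_slab toy_fresh(void)
{
    struct toy_slab s = { 0, 0, { 1, 2, 3, TOY_END } };

    return s;
}
```

On a fresh slab, successive calls hand out indices in ascending order until the head reaches the end marker.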
The function that returns an object to its slab block is slab_put_obj, implemented in mm/slab.c:
2883 static void slab_put_obj(struct kmem_cache *cachep, struct slab *slabp,
2884                          void *objp, int nodeid)
2885 {
2886     unsigned int objnr = obj_to_index(cachep, slabp, objp);
2887
2888 #if DEBUG
2889     /* Verify that the slab belongs to the intended node */
2890     WARN_ON(slabp->nodeid != nodeid);
2891
2892     if (slab_bufctl(slabp)[objnr] + 1 <= SLAB_LIMIT + 1) {
2893         printk(KERN_ERR "slab: double free detected in cache "
2894                "'%s', objp %p\n", cachep->name, objp);
2895         BUG();
2896     }
2897 #endif
2898     slab_bufctl(slabp)[objnr] = slabp->free;
2899     slabp->free = objnr;
2900     slabp->inuse--;
2901 }
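Line 2886 computes the object's index. Lines 2898-2899 make the freed entry point to the previous list head and then make the freed index the new head. Line 2900 decrements the count of allocated objects.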
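Freeing is the mirror image: the released index is pushed onto the head of the list, so the most recently freed object is the next one to be handed out. A user-space sketch without the DEBUG checks follows; `toy_slab`, `TOY_END`, and the helper names are hypothetical.

```c
#include <assert.h>

#define TOY_NUM 4
#define TOY_END 0xffffffffu   /* hypothetical end-of-list marker */

struct toy_slab {
    unsigned int inuse;            /* objects currently handed out */
    unsigned int free;             /* head of the free index list */
    unsigned int bufctl[TOY_NUM];  /* bufctl[i]: free index after i */
};

/* Push a released index onto the head of the free list. */
static void toy_put_index(struct toy_slab *s, unsigned int idx)
{
    s->bufctl[idx] = s->free;  /* freed entry points at the old head */
    s->free = idx;             /* freed index becomes the new head */
    s->inuse--;
}

/* A full slab: every object handed out, free list empty. */
static struct toy_slab toy_full(void)
{
    struct toy_slab s = { TOY_NUM, TOY_END, { 0 } };

    return s;
}
```

Reusing the most recently freed object first tends to return memory that is still warm in the CPU cache.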
The function that initializes a slab block's free index list is cache_init_objs, implemented in mm/slab.c:
static void cache_init_objs(struct kmem_cache *cachep,
                            struct slab *slabp)
{
    int i;

    for (i = 0; i < cachep->num; i++) {
        void *objp = index_to_obj(cachep, slabp, i);
#if DEBUG
        /* need to poison the objs? */
        if (cachep->flags & SLAB_POISON)
            poison_obj(cachep, objp, POISON_FREE);
        if (cachep->flags & SLAB_STORE_USER)
            *dbg_userword(cachep, objp) = NULL;

        if (cachep->flags & SLAB_RED_ZONE) {
            *dbg_redzone1(cachep, objp) = RED_INACTIVE;
            *dbg_redzone2(cachep, objp) = RED_INACTIVE;
        }
        /*
         * Constructors are not allowed to allocate memory from the same
         * cache which they are a constructor for.  Otherwise, deadlock.
         * They must also be threaded.
         */
        if (cachep->ctor && !(cachep->flags & SLAB_POISON))
            cachep->ctor(objp + obj_offset(cachep));

        if (cachep->flags & SLAB_RED_ZONE) {
            if (*dbg_redzone2(cachep, objp) != RED_INACTIVE)
                slab_error(cachep, "constructor overwrote the"
                           " end of an object");
            if (*dbg_redzone1(cachep, objp) != RED_INACTIVE)
                slab_error(cachep, "constructor overwrote the"
                           " start of an object");
        }
        if ((cachep->buffer_size % PAGE_SIZE) == 0 &&
                OFF_SLAB(cachep) && cachep->flags & SLAB_POISON)
            kernel_map_pages(virt_to_page(objp),
                             cachep->buffer_size / PAGE_SIZE, 0);
#else
        if (cachep->ctor)
            cachep->ctor(objp);
#endif
        slab_bufctl(slabp)[i] = i + 1;
    }
    slab_bufctl(slabp)[i - 1] = BUFCTL_END;
}
Ignoring the DEBUG macro, cache_init_objs initializes the free index list and calls the constructor on each object.
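The non-DEBUG path can be sketched in user space as below: run the constructor on each object, chain entry i to i + 1, and terminate the last entry. The names `toy_init_objs`, `toy_ctor`, `toy_ctor_calls`, and `TOY_END` are hypothetical, and the counting constructor exists only so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM 6
#define TOY_END 0xffffffffu   /* hypothetical end-of-list marker */

static unsigned int toy_ctor_calls;   /* counts constructor invocations */

/* Hypothetical constructor: only records that it was called. */
static void toy_ctor(void *obj)
{
    (void)obj;
    toy_ctor_calls++;
}

/*
 * Construct each of the num objects (num >= 1) and build the
 * initial free list 0 -> 1 -> ... -> num - 1 -> end.
 */
static void toy_init_objs(char *mem, size_t size, unsigned int num,
                          unsigned int *bufctl, void (*ctor)(void *))
{
    unsigned int i;

    for (i = 0; i < num; i++) {
        if (ctor)
            ctor(mem + size * i);  /* address of object i */
        bufctl[i] = i + 1;         /* entry i points at i + 1 */
    }
    bufctl[i - 1] = TOY_END;       /* last entry terminates the list */
}
```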