项目作者: gaeduron

项目描述 :
💾 Memory Allocation - 42 project {Mmap, memory allocation algorithm, Munmap & free}
高级语言: C
项目地址: git://github.com/gaeduron/Malloc.git
创建时间: 2019-02-15T19:00:45Z
项目社区:https://github.com/gaeduron/Malloc

开源协议:

下载


Malloc

Introduction

⚠️THIS REPO IS A WORK IN PROGRESS AT THIS POINT

This is simple implementation of malloc. This malloc is a way to get familiar with memory allocation in C.
If you are looking for a more advanced malloc implementation you can take a look at the glibc implementation.

📖 Summary

  1. Guideline
  2. Data-Structures overview
  3. Data-Structures details
    1. Chunk
    2. Bin
    3. Zone
    4. Global zone storage
  4. Functions Overview
    1. Malloc
    2. Free
    3. Realloc
  5. Documentation

📃 Guideline

This malloc implementation is different from the glibc. The constrain for this project are:

  • You can only use some allowed functions:

    1. mmap(2)
    2. munmap(2)
    3. getpagesize(3)
    4. getrlimit(2)
    5. The authorized functions within your libft (write(2) par exemple ;-) )
    6. The functions from libpthread
  • The project must be written in accordance with the 42 Norm

  • You have to “pre-allocate” some memory zones to store your “small”
    and “medium” malloc.
  • Each zone must contain at least 100 allocations.
    1. TINY mallocs, from 1 to n bytes, will be stored in N bytes big zones.<br>
    2. SMALL mallocs, from (n+1) to m bytes, will be stored in M bytes big zones.<br>
    3. LARGE mallocs, fron (m+1) bytes and more, will be stored out of zone,<br>
    4. which simply means with mmap(), they will be in a zone on their own.<br>

🔭 Data-Structures overview

Chunk

Each chunk has a payload (where the user data is stored) and meta-data about how big it is (via a size field in the chunk header).

Chunks payload address is what is returned to the user.

When a chunk is in use by the application, the only data that’s “remembered” is the size of the chunk.

You can use this size to find the next chunk in a bin.

Bin

Bins contains multiples chunks.

They contain only a specific type of chunks, either only SMALL chunks or only TINY chunks.

A bin can contain at least 100 chunks.

Each bins are the same size (e.g.: MAX_TINY_CHUNK_SIZE * 100 + headers).

Zone

Zones contains multipes bins.

Each zone is a freelist of all the bins in that specific zone.
In this implementation the zones are [TINY, SMALL, LARGE], but you could have more zones.

Global zone storage

This were all the zones are stored. This is a global variable. It can be access by malloc and free.

🔬 Data-Structures details

📄 Chunk

Chunks are 8 bytes aligned
The minimum size of a chunk is 2*sizeof(void*). Usually sizeof(void*) is sizeof(size_t).
You can find the size of your size_t with this command:

  1. echo | gcc -E -xc -include 'stddef.h' - | grep size_t

A chunk in use:

  1. - schema -
  2. |---|---|---|---|---|---|---|---|
  3. | size_of_the_chunk | P| P = Previous chunk is in use
  4. |---|---|---|---|---|---|---|---|
  5. | payload |
  6. | |
  7. | |
  8. | |
  9. |---|---|---|---|---|---|---|---|
  10. - example -
  11. |---|---|---|---|---|---|---|---|
  12. |x00 00 00 00 00 00 20| 01| Size = 32 bytes, P = true
  13. |---|---|---|---|---|---|---|---|
  14. | payload | Payload = User data
  15. | |
  16. | |
  17. | |
  18. |---|---|---|---|---|---|---|---|

In a free chunk we also have at the end the size of the chunk. That way we can navigate back in our bin.
We will use this way of navigation during the memory defragmentation process.
Where we want to combine free chunk with adjacent free chunks to “coalesce” them into larger chunks.

A free chunk:

  1. - schema -
  2. |---|---|---|---|---|---|---|---|
  3. | size_of_the_chunk | P|
  4. |---|---|---|---|---|---|---|---|
  5. | payload |
  6. | |
  7. | |
  8. |---|---|---|---|---|---|---|---|
  9. | size_of_the_chunk |
  10. |---|---|---|---|---|---|---|---|
  11. - example -
  12. |---|---|---|---|---|---|---|---|
  13. |x00 00 00 00 00 00 20| 01| Size = 32 bytes, P = true
  14. |---|---|---|---|---|---|---|---|
  15. | payload | Payload = User data
  16. | |
  17. | |
  18. |---|---|---|---|---|---|---|---|
  19. |x00 00 00 00 00 00 20| 00|
  20. |---|---|---|---|---|---|---|---|

Now let’s see an exemple with 3 adjacent chunks

  1. |---|---|---|---|---|---|---|---|
  2. |x00 00 00 00 00 00 20| 01| | Chunk 1
  3. |---|---|---|---|---|---|---|---| | Size = 32 bytes, previous chunk is used
  4. | payload | | Is now free
  5. | | |
  6. | | |
  7. |---|---|---|---|---|---|---|---|
  8. |x00 00 00 00 00 00 20| 01|
  9. |---|---|---|---|---|---|---|---|
  10. |x00 00 00 00 00 00 18| 00| | Chunk 2
  11. |---|---|---|---|---|---|---|---| | Size = 24 bytes, previous chunk is free
  12. | payload | | Is now in use
  13. | | |
  14. | | |
  15. |---|---|---|---|---|---|---|---|
  16. |x00 00 00 00 00 00 08| 03| | Chunk 3
  17. |---|---|---|---|---|---|---|---| | Size = 24 bytes, previous chunk is used
  18. | payload | | Is the last chunk of the bin
  19. |---|---|---|---|---|---|---|---|

In the last chunk we have P = 3, this mean that it’s the last chunk of the bin.
Because our chunk are aligned on 8 bytes, we can use the three least significant bits of our size to set some flags.

  1. Chunk 1 header in binary
  2. | LP
  3. |--------|--------|--------|--------|--------|--------|--------|--------|
  4. |00000000|00000000|00000000|00000000|00000000|00000000|00100000|00000001|
  5. |--------|--------|--------|--------|--------|--------|--------|--------|
  6. Chunk 3 header in binary
  7. | LP
  8. |--------|--------|--------|--------|--------|--------|--------|--------|
  9. |00000000|00000000|00000000|00000000|00000000|00000000|00001000|00000011|
  10. |--------|--------|--------|--------|--------|--------|--------|--------|

So the L flag mean that this is the last chunk in a bin.

📂 Bin

Bins are multiples of get_page_size(), so they are large enough to be allocated with mmap.

They have a header which contain two pointers.

One pointing to the next bin in that zone and one pointing to the previous bin.

Bins are created with only two chunk at first.

Chunkd will be divided at each new allocation needed.

The second chunk is the last chunk of the bin. This chunk will never hold user data

The first chunk in a bin will always have the flag F (first) set as True.

The last chunk in a bin will always have the flag L (last) set as True.

That way we will know when we hit the end of a bin when searching for memory.

When coalescing our chunks, if we stumble upon the first chunk we will do the following:

  • Test if it’s free.
  • Test if it’s adjacent to the last chunk of the bin.
    If the two conditions are true, this bin is empty and we can use munmap() to give it back to the system.

Let’s see a more visual explanation of a bin lifecycle:

  1. void *ptr = (void*)malloc(8);
  2. // create a bin
  3. | BIN HEADER | BIN PAYLOAD |
  4. | | first chunk |lastchnk|
  5. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  6. |*bck_ptr|*nxt_ptr| MAX|101| free memory to use | MAX|101| 0|010|
  7. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  8. // create a chunk of size 8
  9. | BIN HEADER | BIN PAYLOAD |
  10. | | first chunk | second chunk |lastchnk|
  11. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  12. |*bck_ptr|*nxt_ptr| 24|101| free memory | 24|101| 8|000| payload| 0|011|
  13. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  14. free(ptr);
  15. // set the chunk as free
  16. | BIN HEADER | BIN PAYLOAD |
  17. | | first chunk | second chunk |lastchnk|
  18. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  19. |*bck_ptr|*nxt_ptr| 24|101| free memory | 24|101| 8|000| 8|000| 0|010|
  20. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  21. // defragment the bin
  22. | BIN HEADER | BIN PAYLOAD |
  23. | | first chunk |lastchnk|
  24. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  25. |*bck_ptr|*nxt_ptr| MAX|101| free memory | MAX|101| 0|010|
  26. |--------|--------|--------|--------|--------|--------|--------|--------|--------|
  27. // The first chunk is free and is adjacent to the last chunk, so we can give this bin to munmap()

🗃 Zone

A zone is a freelist where each nodes is a bin.

  1. | BIN HEADER | PAYLOAD |
  2. |--------|--------|---------|
  3. |*bck_ptr|*nxt_ptr| chunks |
  4. |--------|--------|---------|
  5. |----------------------------------------------ZONE A-----------------------------------------------|
  6. |bin1_address = 0x001... |bin2_address = 0x002... |bin3_address = 0x003...
  7. | BIN HEADER | PAYLOAD | | BIN HEADER | PAYLOAD | | BIN HEADER | PAYLOAD |
  8. |--------|--------|---------| |--------|--------|---------| |--------|--------|---------|
  9. | 0|0x002...| chunks | <-> |0x001...|0x003...| chunks | <-> |0x002...| 0| chunks |
  10. |--------|--------|---------| |--------|--------|---------| |--------|--------|---------|

🗄 Global zone storage

It’s a simple array with the first pointer of each zone.

  1. enum zone {TINY, SMALL, LARGE, ZONE_COUNT};
  2. extern void g_zones[ZONE_COUNT];
  3. // [TINY_PTR, SMALL_PTR, LARGE_PTR]

🛠 Functions Overview

Malloc

Here is an overview of how this malloc implementation will work

1 - malloc

2 - find_space

3 - search_in_zone

4 - create_bin

5 - set_chunk_header

6 - set_bin_headers

Free

1 - free

1 - defragment

1 - bin_is_empty

Realloc

Realloc will be really simple in this implementation.

It will just call malloc to get a new space of the right size in memory.

Then we will do a memcopy from the previous memory to the new one.

Now we just free the old memory.

Then return the new pointer.

📚 Documentation

Glibc implementation of malloc()

https://sourceware.org/glibc/wiki/MallocInternals

Really well documented implementation of malloc from glibc

https://github.com/lattera/glibc/blob/master/malloc/malloc.c

Good slides on mmap, sbrk, malloc

https://pubweb.eng.utah.edu/~cs4400/malloc.pdf