Project author: BinaryResearch

Project description:
Lepton is a Lightweight ELF Parsing Tool that was designed specifically for analyzing and editing binaries with damaged or corrupted ELF headers.
Language: Assembly
Repository: git://github.com/BinaryResearch/lepton.git
Created: 2019-09-30T16:40:08Z
Project page: https://github.com/BinaryResearch/lepton

License: MIT License

Overview

Lepton is a Lightweight ELF Parsing Tool that was designed specifically for analyzing and editing binaries with damaged or corrupted
ELF headers, such as:

  • extremely minimalist ELF files in which the entry point and program header table lie within the ELF header
  • binaries that have had the ELF header deliberately mangled as an anti-analysis method (crackmes or malware)

Development was prompted by the failure of other tools to parse some of the ELF binaries in
Muppetlabs’ “tiny” ELF file series.

When using Lepton to parse ELF binaries, one has access to every field in the ELF header as well as every field in every entry of the
program header table. Individual fields can be straightforwardly modified to repair corruption.

Lepton succeeds in cases where other parsers fail for two main reasons:

  1. When reading the ELF header and program header table, the fields are simply read without any assumptions about
    their correctness and without additional analysis. The main exceptions are the magic bytes and the value of the e_machine field; if the file
    being read is not an ELF file or the architecture is not supported, Lepton quits. The result is that if the binary can be executed, it can also be
    parsed correctly by Lepton, regardless of the extent of the corruption in the ELF header.

  2. When reconstructing the ELF header, only the values in the fields read by the kernel when loading the binary into memory are considered correct;
    the values of the rest of the fields are derived from the fields required by the kernel or assigned standard values. For example, the endianness
    and architecture of the data in the file are derived from the value in the e_machine field, which must be correct in order for the binary to be
    loaded by the kernel.
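
The two rules above can be illustrated with a minimal sketch. It rejects a file only on the magic bytes or an unsupported e_machine value, and rebuilds a standard e_ident from e_machine alone. The names SUPPORTED_MACHINES and derive_ident are hypothetical and are not part of Lepton's API; the table is also not the layout of the architectures dictionary in ELFStructures.py.

```python
import struct

# Hypothetical table for illustration: e_machine value -> (EI_CLASS, EI_DATA).
# EM_386 = 3 and EM_X86_64 = 62; both are little-endian.
SUPPORTED_MACHINES = {
    3: (1, 1),    # ELFCLASS32, ELFDATA2LSB
    62: (2, 1),   # ELFCLASS64, ELFDATA2LSB
}

def derive_ident(header: bytes) -> bytes:
    """Rebuild a standard 16-byte e_ident from the magic bytes and e_machine alone."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_machine is the 2-byte field at offset 18 in both ELF32 and ELF64.
    # Little-endian is assumed, which holds for the two machines listed above.
    (e_machine,) = struct.unpack_from("<H", header, 18)
    if e_machine not in SUPPORTED_MACHINES:
        raise ValueError(f"unsupported architecture: {e_machine:#x}")
    ei_class, ei_data = SUPPORTED_MACHINES[e_machine]
    # EI_VERSION = 1 (current), EI_OSABI = 0 (System V), the rest is zero padding.
    return b"\x7fELF" + bytes([ei_class, ei_data, 1]) + bytes(9)
```

Whatever the original bytes 4 to 15 of e_ident contain is ignored here, since only the magic bytes and e_machine are trusted.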

Example scripts and test binaries are included in the repository.

Currently, only x86 and x86-64 binaries are supported, but support for additional architectures can be added very easily by creating a new
entry in the architectures dictionary in ELFStructures.py.

Usage

A detailed example can be found in “Analyzing ELF Binaries with Malformed Headers Part 3 - Automatically Solving a Corrupted Keygenme with angr”.

Example use cases:

Editing Corrupted ELF Header Fields

One anti-analysis trick is to corrupt the ELF header by writing incorrect values to the fields that describe sections.
Some tools will subsequently fail to parse or load the binary. A concrete example of this is a “keygenme” crackme from crackmes.one
that Ghidra (9.1-BETA_DEV_20190923) fails to load. The crackme file is included with this repository, in the test_binaries folder.

Ghidra fails to correctly parse ELF header

Using readelf, it can clearly be seen that the start of the section headers (e_shoff), the number of section headers (e_shnum),
and the section header string table index (e_shstrndx) all hold bogus values:

    $ readelf -h keygenme_copy
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              DYN (Shared object file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x1320
      Start of program headers:          64 (bytes into file)
      Start of section headers:          65535 (bytes into file)    <--------------
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         11
      Size of section headers:           64 (bytes)
      Number of section headers:         65535    <--------------
      Section header string table index: 65535 <corrupt: out of range>    <--------------
    readelf: Error: Reading 4194240 bytes extends past end of file for section headers
    readelf: Error: Reading 14312 bytes extends past end of file for dynamic string table
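
The first readelf error follows directly from the header arithmetic, which the sketch below reproduces. The function names are hypothetical and not part of Lepton or readelf, and the check deliberately ignores the extended-numbering escape values that the ELF specification defines for e_shnum and e_shstrndx.

```python
def section_table_size(e_shentsize: int, e_shnum: int) -> int:
    """Number of bytes the section header table claims to occupy."""
    return e_shentsize * e_shnum

def section_fields_look_bogus(e_shoff: int, e_shentsize: int, e_shnum: int,
                              e_shstrndx: int, file_size: int) -> bool:
    """Return True if the section fields cannot describe data inside the file."""
    # The table must end at or before the end of the file.
    past_eof = e_shoff + section_table_size(e_shentsize, e_shnum) > file_size
    # The string table index must name one of the e_shnum sections.
    bad_index = e_shnum > 0 and e_shstrndx >= e_shnum
    return past_eof or bad_index
```

With the values readelf reports (e_shentsize 64, e_shnum 65535), the claimed table size is 64 * 65535 = 4194240 bytes, the figure in the first error message.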

These values can be overwritten such that Ghidra successfully imports the binary. In the script below,
all fields having to do with sections are zeroed out:

    #!/usr/bin/python3

    from lepton import *
    from struct import pack

    def main():
        with open("keygenme", "rb") as f:
            elf_file = ELFFile(f)

        # overwrite field values with 0x00 bytes
        elf_file.ELF_header.fields["e_shoff"] = pack("<Q", 0)
        elf_file.ELF_header.fields["e_shentsize"] = pack("<H", 0)
        elf_file.ELF_header.fields["e_shnum"] = pack("<H", 0)
        elf_file.ELF_header.fields["e_shstrndx"] = pack("<H", 0)

        # output to file: repaired 64-byte header + rest of the original file
        binary = elf_file.ELF_header.to_bytes() + elf_file.file_buffer[64:]
        with open("fixed_crackme", "wb") as f:
            f.write(binary)

    if __name__=="__main__":
        main()
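
For reference, the same patch can be expressed with only the standard library, which makes the position of each field in a 64-byte little-endian ELF64 header explicit. This is a sketch of the technique, not Lepton code, and zero_section_fields is a hypothetical name.

```python
import struct

# (offset, struct format) of the section-related fields in an ELF64 header.
SECTION_FIELDS = (
    (0x28, "<Q"),  # e_shoff
    (0x3A, "<H"),  # e_shentsize
    (0x3C, "<H"),  # e_shnum
    (0x3E, "<H"),  # e_shstrndx
)

def zero_section_fields(image: bytes) -> bytes:
    """Return a copy of an ELF64 image with all section header fields zeroed."""
    patched = bytearray(image)
    for offset, fmt in SECTION_FIELDS:
        struct.pack_into(fmt, patched, offset, 0)
    return bytes(patched)
```

Everything outside those four fields, including e_flags at 0x30 and the program header fields, is left untouched.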

Ghidra now successfully imports the binary and displays the new ELF header values as well:

Ghidra loads the binary after the ELF header is repaired


Recomposing a Corrupted Binary (experimental - failure)

Edit 5/6/2021: This fails in most cases because it breaks most, if not all, code offsets and relocations.

readelf completely fails to read tiny-i386, which is 45 bytes in size - smaller than the 52 bytes of a well-formed ELF32 header:

    $ readelf -h tiny-i386
    readelf: Error: tiny-i386: Failed to read file header
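
A parser can tolerate a file shorter than the header by reading fields one at a time and stopping when the data runs out, instead of demanding all 52 bytes up front. The sketch below shows one way to do that for ELF32; read_elf32_header and ELF32_FIELDS are hypothetical names, and zero-padding a partially present field is an assumption of this sketch rather than a statement about Lepton's implementation.

```python
import struct

# ELF32 header fields that follow the 16-byte e_ident: (name, struct format).
ELF32_FIELDS = [
    ("e_type", "<H"), ("e_machine", "<H"), ("e_version", "<I"),
    ("e_entry", "<I"), ("e_phoff", "<I"), ("e_shoff", "<I"),
    ("e_flags", "<I"), ("e_ehsize", "<H"), ("e_phentsize", "<H"),
    ("e_phnum", "<H"), ("e_shentsize", "<H"), ("e_shnum", "<H"),
    ("e_shstrndx", "<H"),
]

def read_elf32_header(data: bytes) -> dict:
    """Read every ELF32 header field that is present, without validating any."""
    fields = {"e_ident": data[:16]}
    offset = 16
    for name, fmt in ELF32_FIELDS:
        size = struct.calcsize(fmt)
        chunk = data[offset:offset + size]
        if chunk:
            # A field cut short by end of file is zero-padded to full width.
            fields[name] = struct.unpack(fmt, chunk.ljust(size, b"\x00"))[0]
        else:
            # The field lies entirely past the end of the file.
            fields[name] = None
        offset += size
    return fields
```

For a 45-byte file, e_phnum is assembled from the single remaining byte at offset 44, and the three section fields come back as None.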

Lepton can be used to read the ELF header, as well as create a new binary holding the same information as the original but that can be parsed by readelf:

    #!/usr/bin/python3
    # read_and_recompose_tiny-i386.py

    from lepton import *

    def main():
        # raw headers
        with open("tiny-i386", "rb") as f:
            elf_file = ELFFile(f)

        print("\n\tRaw header field values:\n")
        elf_file.ELF_header.print_fields()

        # create new headers
        with open("tiny-i386", "rb") as f:
            elf_file = ELFFile(f, new_header=True)  # create new, well-formed ELF header

        with open("repaired_tiny-i386", "wb") as f:
            # moves the program header out of the file header
            # and recalculates the entry point
            f.write(elf_file.recompose_binary())

        print("\n\tRepaired header field values:\n")
        elf_file.ELF_header.print_fields()  # call once entry point has been recalculated

    if __name__=="__main__":
        main()

When an ELFFile object is instantiated with the new_header argument set to True, a new ELF header and program header table are created
in which fields besides those having to do with sections are given standard values. Fields having to do with sections are filled in with 0x00 bytes.

The recompose_binary() function checks whether the program header table and the ELF header overlap. If so, the program header table is moved out of
the ELF header by copying all of the bytes between the original entry point and the end of the file into a buffer, appending this buffer to a correctly formed
ELF header + program header table, and then recalculating the entry point based on its new offset within the file.
Note that this function is experimental and currently often produces programs with corrupted logic that give erroneous I/O or segfault,
but in this case it works.
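
The steps described above can be sketched as follows. These helpers are hypothetical and are not Lepton's recompose_binary(); they assume a single loadable segment that maps the file from offset 0, so the load base is the entry point minus its file offset.

```python
def headers_overlap(e_phoff: int, e_ehsize: int) -> bool:
    """The program header table overlaps the ELF header if it starts inside it."""
    return e_phoff < e_ehsize

def recomposed_entry(old_entry: int, old_entry_offset: int,
                     e_ehsize: int, e_phentsize: int, e_phnum: int) -> int:
    """Entry point after the code is moved to just behind the new headers."""
    load_base = old_entry - old_entry_offset
    new_entry_offset = e_ehsize + e_phentsize * e_phnum
    return load_base + new_entry_offset

def recompose(data: bytes, new_headers: bytes, old_entry_offset: int) -> bytes:
    """Append everything from the original entry point onward to the new headers."""
    return new_headers + data[old_entry_offset:]
```

For tiny-i386, the program header table starts at offset 0x4, inside the 52-byte header, so the two overlap. The original entry point 0x10020 sits at file offset 0x20, and the new headers occupy 52 + 32 = 84 (0x54) bytes, giving a new entry point of 0x10000 + 0x54 = 0x10054. Moving code this way does not adjust any addresses embedded in the code itself.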

    $ python3 read_and_recompose_tiny-i386.py

        Raw header field values:

    E_IDENT: (127, 69, 76, 70, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0)
    Type: 0x2
    Machine: 0x3
    Version: 0x10020
    Entry point: 0x10020
    Program header table offset (bytes into file): 0x4
    Section header table offset (bytes into file): 0xc0312ab3
    Flags: 0x80cd40
    ELF header size (bytes): 52
    Program header table entry size: 32
    Number of entries in the program header table: 1
    [+] Null field encountered. File is smaller than expected header size [+]

        Repaired header field values:

    E_IDENT: (127, 69, 76, 70, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
    Type: 0x2
    Machine: 0x3
    Version: 0x1
    Entry point: 0x10054
    Program header table offset (bytes into file): 0x34
    Section header table offset (bytes into file): 0x0
    Flags: 0x0
    ELF header size (bytes): 52
    Program header table entry size: 32
    Number of entries in the program header table: 1
    Section header table entry size: 0
    Number of entries in the section header table: 0
    Number of entries in the string header index table: 0

The newly created ELF file is called repaired_tiny-i386. readelf can now parse it without choking:

    $ readelf -h repaired_tiny-i386
    ELF Header:
      Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF32
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Intel 80386
      Version:                           0x1
      Entry point address:               0x10054
      Start of program headers:          52 (bytes into file)
      Start of section headers:          0 (bytes into file)
      Flags:                             0x0
      Size of this header:               52 (bytes)
      Size of program headers:           32 (bytes)
      Number of program headers:         1
      Size of section headers:           0 (bytes)
      Number of section headers:         0
      Section header string table index: 0

    $ readelf -l repaired_tiny-i386

    Elf file type is EXEC (Executable file)
    Entry point 0x10054
    There is 1 program header, starting at offset 52

    Program Headers:
      Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
      LOAD           0x000000 0x00010000 0x00030002 0x10020 0x10020 R   0xc0312ab3

The runtime behavior of the new file is identical to the original:

    $ strace ./repaired_tiny-i386
    execve("./repaired_tiny-i386", ["./repaired_tiny-i386"], 0x7ffd19a0f1b0 /* 52 vars */) = 0
    strace: [ Process PID=5822 runs in 32 bit mode. ]
    exit(42) = ?
    +++ exited with 42 +++

More examples can be found in the programs in the scripts folder.

Test Binaries

The test binaries included in this repo are from Muppetlabs’ “tiny” series, as well as netspooky’s “golfclub” programs.

TODO

  • The ELFExceptions module needs work as it is very basic.

  • The Lepton classes and functions need to be documented.