Support macOS 12 dyld shared cache format #358

Closed
mstange opened this issue Aug 18, 2021 · 8 comments · Fixed by #398

Comments

@mstange
Contributor

mstange commented Aug 18, 2021

On macOS 12 the dyld shared cache format seems to have changed. At least this is the case in the Beta that I have running at the moment.

Instead of one cache file per architecture, the cache is now split up into multiple files, each being a little over 700MB. The filenames are:

/System/Library/dyld/dyld_shared_cache_x86_64
/System/Library/dyld/dyld_shared_cache_x86_64.1
/System/Library/dyld/dyld_shared_cache_x86_64.2
/System/Library/dyld/dyld_shared_cache_x86_64.3

(and similarly for the other architectures)
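The naming scheme above can be sketched as a small path helper. This is only an illustration based on the file names observed here: `chunk_path` and `existing_chunk_paths` are hypothetical names, and the assumption that chunks are numbered consecutively from `.1` comes from this one beta build.

```rust
use std::path::{Path, PathBuf};

/// Returns the path of the `index`-th cache file for a given root cache path.
/// Index 0 is the root cache itself; index N > 0 appends the suffix ".N".
fn chunk_path(root: &Path, index: u32) -> PathBuf {
    if index == 0 {
        return root.to_path_buf();
    }
    // Append to the whole file name rather than calling `with_extension`,
    // which would replace any existing extension instead of adding one.
    let mut name = root.as_os_str().to_os_string();
    name.push(format!(".{}", index));
    PathBuf::from(name)
}

/// Returns the root path followed by every consecutively numbered chunk
/// that exists on disk, stopping at the first missing suffix.
/// The root path is always included, whether or not it exists.
fn existing_chunk_paths(root: &Path) -> Vec<PathBuf> {
    let mut paths = vec![root.to_path_buf()];
    let mut index = 1;
    loop {
        let candidate = chunk_path(root, index);
        if !candidate.is_file() {
            break;
        }
        paths.push(candidate);
        index += 1;
    }
    paths
}
```

Probing the file system like this is a guess at a convenient approach, not something the cache format is known to require.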

The old list of images (currently parsed by DyldCache::images) in each file is now empty.

Instead, the header points to a new list of images, let's call them images_across_all_chunks. For a given architecture, all cache chunk files have the same full list of images_across_all_chunks, with the same image addresses.

For example the x86_64 cache has the following list, in all cache chunk files:

<image_address> <path>
0x7ff800050000 /usr/lib/system/libsystem_blocks.dylib
0x7ff800052000 /usr/lib/system/libxpc.dylib
[...]
0x7ffb1bc46000 /System/iOSSupport/usr/lib/swift/libswiftWebKit.dylib
0x7ffb1bc4a000 /System/iOSSupport/usr/lib/swift/~libswiftPencilKit.dylib

However, the cache files differ in the mappings. It looks like each chunk covers a 4 GiB range of virtual addresses.

/System/Library/dyld/dyld_shared_cache_x86_64:
0x7ff800000000-0x7ff81fd10000 at file offset 0x0
0x7ff840000000-0x7ff8418c8000 at file offset 0x1fd10000
0x7ff8418c8000-0x7ff843664000 at file offset 0x215d8000
0x7ff880000000-0x7ff88bf08000 at file offset 0x23374000

/System/Library/dyld/dyld_shared_cache_x86_64.1:
0x7ff900000000-0x7ff9201e0000 at file offset 0x0
0x7ff940000000-0x7ff942878000 at file offset 0x201e0000
0x7ff942878000-0x7ff944664000 at file offset 0x22a58000
0x7ff980000000-0x7ff98cac4000 at file offset 0x24844000

/System/Library/dyld/dyld_shared_cache_x86_64.2:
0x7ffa00000000-0x7ffa1ff44000 at file offset 0x0
0x7ffa40000000-0x7ffa42304000 at file offset 0x1ff44000
0x7ffa42304000-0x7ffa44f70000 at file offset 0x22248000
0x7ffa80000000-0x7ffa88ea0000 at file offset 0x24eb4000

/System/Library/dyld/dyld_shared_cache_x86_64.3:
0x7ffb00000000-0x7ffb1bc58000 at file offset 0x0
0x7ffb40000000-0x7ffb41254000 at file offset 0x1bc58000
0x7ffb41254000-0x7ffb43d98000 at file offset 0x1ceac000
0x7ffb80000000-0x7ffb8adbc000 at file offset 0x1f9f0000
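To make the role of the mappings concrete, here is a minimal sketch of translating an image address into a cache file index and file offset. The `Mapping` struct, `locate` and `example_files` are hypothetical names for illustration, and the sample data contains only the first mapping of each x86_64 file listed above.

```rust
/// One mapping of a cache file: a virtual address range and its file offset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Mapping {
    address: u64,
    size: u64,
    file_offset: u64,
}

/// Finds the cache file and file offset that back `address`.
/// `files[i]` holds the mappings of the i-th file (0 = root, 1 = ".1", ...).
fn locate(files: &[Vec<Mapping>], address: u64) -> Option<(usize, u64)> {
    for (file_index, mappings) in files.iter().enumerate() {
        for m in mappings {
            // `checked_sub` yields None when `address` lies below the mapping.
            if let Some(delta) = address.checked_sub(m.address) {
                if delta < m.size {
                    return Some((file_index, m.file_offset + delta));
                }
            }
        }
    }
    None
}

/// The first mapping of each x86_64 cache file from the listing above.
fn example_files() -> Vec<Vec<Mapping>> {
    vec![
        vec![Mapping { address: 0x7ff800000000, size: 0x1fd10000, file_offset: 0 }],
        vec![Mapping { address: 0x7ff900000000, size: 0x201e0000, file_offset: 0 }],
        vec![Mapping { address: 0x7ffa00000000, size: 0x1ff44000, file_offset: 0 }],
        vec![Mapping { address: 0x7ffb00000000, size: 0x1bc58000, file_offset: 0 }],
    ]
}
```

With this data, `locate(&example_files(), 0x7ffb1bc46000)` resolves the libswiftWebKit.dylib image address to file index 3 (the `.3` file) at offset `0x1bc46000`.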

The images_across_all_chunks list can be accessed from the header like this:

#[derive(Debug, Clone, Copy)]
#[repr(C)]
pub struct DyldCacheHeader<E: Endian> {
    /// e.g. "dyld_v0    i386"
    pub magic: [u8; 16],
    /// file offset to first dyld_cache_mapping_info
    pub mapping_offset: U32<E>,
    /// number of dyld_cache_mapping_info entries
    pub mapping_count: U32<E>,
    /// file offset to first dyld_cache_image_info
    pub images_offset: U32<E>,
    /// number of dyld_cache_image_info entries
    pub images_count: U32<E>,
    /// base address of dyld when cache was built
    pub dyld_base_address: U64<E>,
    /// other header fields, skipped here (408 bytes, so the next field is at offset 448)
    reserved: [u8; 408],
    /// file offset to first dyld_cache_image_info
    pub images_across_all_chunks_offset: U32<E>,
    /// number of dyld_cache_image_info entries
    pub images_across_all_chunks_count: U32<E>,
}

In the build I have, mapping_offset is 456, which is also the size of the header: the list of mappings starts right after the header.
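As a sanity check on the 456 figure, the layout can be mirrored with native integer types and measured. This is a sketch only: `RawHeader` is a hypothetical stand-in for the struct above, with plain `u32`/`u64` in place of the endian-aware `U32<E>`/`U64<E>` types.

```rust
use std::mem::size_of;

/// Native-integer mirror of the header layout shown above (hypothetical name).
#[allow(dead_code)]
#[repr(C)]
struct RawHeader {
    magic: [u8; 16],
    mapping_offset: u32,
    mapping_count: u32,
    images_offset: u32,
    images_count: u32,
    dyld_base_address: u64,
    reserved: [u8; 408],
    images_across_all_chunks_offset: u32,
    images_across_all_chunks_count: u32,
}

/// Byte offset of `images_across_all_chunks_offset`: the summed sizes of
/// magic (16), four u32 fields (16), one u64 (8) and the reserved bytes (408).
const NEW_IMAGES_OFFSET_FIELD: usize = 16 + 4 * 4 + 8 + 408;

/// Size of the mirrored header in bytes.
fn header_size() -> usize {
    size_of::<RawHeader>()
}
```

Here `header_size()` is 456 and `NEW_IMAGES_OFFSET_FIELD` is 448, so the two new `u32` fields occupy the last 8 bytes before the mappings begin.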

@mstange
Contributor Author

mstange commented Aug 18, 2021

It's probably a good idea to wait for macOS 12 to be released and for the new dyld source to become available, before making any changes to object.

This issue is just a heads-up.

@mstange
Contributor Author

mstange commented Nov 6, 2021

The implementation I suggested above doesn't seem to work in all cases. For example, for arm64e libsystem_malloc.dylib, it looks like two chunks are needed in order to get symbol names: The list of symbols is in the first chunk and the string table is in the second chunk.
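A rough sketch of why two chunks are needed: in Mach-O, each symbol entry stores an index (`n_strx`) into a string table whose position comes from the symtab load command (`stroff`), so the name has to be read from whichever file holds the string table. The function name and the idea of passing that file's bytes as a separate slice are assumptions for illustration, not the crate's API.

```rust
/// Reads the NUL-terminated symbol name at `strtab_offset + n_strx` from the
/// bytes of the file that holds the string table. That file may differ from
/// the one containing the symbol entries themselves.
fn symbol_name(strtab_file: &[u8], strtab_offset: usize, n_strx: usize) -> Option<&[u8]> {
    let start = strtab_offset.checked_add(n_strx)?;
    let tail = strtab_file.get(start..)?;
    // The name ends at the first NUL byte; an unterminated name is rejected.
    let len = tail.iter().position(|&b| b == 0)?;
    Some(&tail[..len])
}
```

If the string table offset is applied to the wrong chunk's data, a lookup like this either fails or returns unrelated bytes, which matches the missing symbol names described here.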

mstange changed the title from "Dyld shared cache parsing will likely need changes for macOS 12" to "Support macOS 12 dyld shared cache format" Nov 6, 2021
@mstange
Contributor Author

mstange commented Nov 6, 2021

The strings are in the chunk which contains the __LINKEDIT segment.
(It looks like chunks are called "sub caches".)

All images in the arm64e dyld shared cache share the same __LINKEDIT segment, and it's in the .1 subcache. For example, on macOS 12.0.1, all images (in both subcaches) have a __LINKEDIT segment with the following address range:

0x21018c000-0x233c039fd at file offset 0x2a7f4000

And the subcache mappings are as follows:

dyld_shared_cache_arm64e:
0x180000000-0x1cf73c000 at file offset 0x0
0x1d173c000-0x1d4334000 at file offset 0x4f73c000
0x1d6334000-0x1d8f98000 at file offset 0x52334000
0x1d8f98000-0x1da6d0000 at file offset 0x54f98000
0x1da6d0000-0x1dd96c000 at file offset 0x566d0000
0x1df96c000-0x1df998000 at file offset 0x5996c000

dyld_shared_cache_arm64e.1:
0x1df998000-0x204628000 at file offset 0x0
0x206628000-0x2073a0000 at file offset 0x24c90000
0x2093a0000-0x20bb18000 at file offset 0x25a08000
0x20bb18000-0x20c81c000 at file offset 0x28180000
0x20c81c000-0x20d9e8000 at file offset 0x28e84000
0x20f9e8000-0x2371dc000 at file offset 0x2a050000
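The numbers above can be cross-checked with a small range lookup: find the file whose mapping fully contains a segment's address range, and compute the file offset of its start. `Mapping`, `locate_range` and `arm64e_files` are hypothetical names, and the data is copied from the two listings above.

```rust
/// (start address, end address, file offset) of one mapping.
type Mapping = (u64, u64, u64);

/// Returns the index of the file with a mapping that fully contains
/// [start, end), together with the file offset corresponding to `start`.
fn locate_range(files: &[Vec<Mapping>], start: u64, end: u64) -> Option<(usize, u64)> {
    for (file_index, mappings) in files.iter().enumerate() {
        for &(map_start, map_end, file_offset) in mappings {
            if map_start <= start && end <= map_end {
                return Some((file_index, file_offset + (start - map_start)));
            }
        }
    }
    None
}

/// Mappings of the arm64e root cache (index 0) and the .1 subcache (index 1).
fn arm64e_files() -> Vec<Vec<Mapping>> {
    vec![
        vec![
            (0x180000000, 0x1cf73c000, 0x0),
            (0x1d173c000, 0x1d4334000, 0x4f73c000),
            (0x1d6334000, 0x1d8f98000, 0x52334000),
            (0x1d8f98000, 0x1da6d0000, 0x54f98000),
            (0x1da6d0000, 0x1dd96c000, 0x566d0000),
            (0x1df96c000, 0x1df998000, 0x5996c000),
        ],
        vec![
            (0x1df998000, 0x204628000, 0x0),
            (0x206628000, 0x2073a0000, 0x24c90000),
            (0x2093a0000, 0x20bb18000, 0x25a08000),
            (0x20bb18000, 0x20c81c000, 0x28180000),
            (0x20c81c000, 0x20d9e8000, 0x28e84000),
            (0x20f9e8000, 0x2371dc000, 0x2a050000),
        ],
    ]
}
```

For the __LINKEDIT range `0x21018c000-0x233c039fd`, this lookup lands in the last mapping of the .1 subcache at file offset `0x2a7f4000`, which agrees with the offset quoted above.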

mstange added a commit to mstange/object that referenced this issue Nov 7, 2021
Fixes gimli-rs#358.

This adds support for the dyld cache format that is used on macOS 12 and
iOS 15. The cache is split over multiple files, with a "root" cache
and one or more subcaches, for example:

```
/System/Library/dyld/dyld_shared_cache_x86_64
/System/Library/dyld/dyld_shared_cache_x86_64.1
/System/Library/dyld/dyld_shared_cache_x86_64.2
/System/Library/dyld/dyld_shared_cache_x86_64.3
```

Each file has a set of mappings. For each image in the cache, the
segments of that image can be distributed over multiple files: For
example, on macOS 12.0.1, the image for libsystem_malloc.dylib for the
arm64e architecture has its __TEXT segment in the root cache and the
__LINKEDIT segment in the .1 subcache - there's a single __LINKEDIT
segment which is shared between all images across both files. The
remaining libsystem_malloc.dylib segments are in the same file as the
__TEXT segment.

This has some unfortunate implications for the API: The DyldCache API
now requires the data for all subcaches to be supplied to the
constructor, and the File::parse_at API now receives two "data"
arguments: One which contains the __LINKEDIT segment and one which
contains the other segments. The symtab and str information is read from
the data that contains the __LINKEDIT segment.
It is possible that there are other things that need to be read from
the __LINKEDIT segment.
This also adds an assumption that the __LINKEDIT segment is the only
segment that could be split out. I don't know if that's an ok assumption
to make. It's also an assumption that's not checked at the moment; a
check for this could be added.

With this patch, the following command outputs correct symbols for
libsystem_malloc.dylib:

```
cargo run --release --bin objdump -- /System/Library/dyld/dyld_shared_cache_arm64e /usr/lib/system/libsystem_malloc.dylib
```
@mstange
Contributor Author

mstange commented Nov 7, 2021

I have this working now.

@blacktop

blacktop commented Nov 7, 2021

It works with ipsw, but I know Golang isn't pleasant to read for some people ;)

@mstange
Contributor Author

mstange commented Nov 7, 2021

Thanks for the pointer! This suggests that I should also add support for the .symbols subcache. Symbol subcaches seem to be only used on iOS, not on macOS.

For reference, ipsw's dsc header definition is here: https://github.com/blacktop/ipsw/blob/05af7d2bde570035b25e7bfdf316d9f363c0106c/hack/extras/Dyld.bt#L116-L124

(I've found a small issue with that header definition, filed as blacktop/ipsw#49.)

@blacktop

blacktop commented Nov 7, 2021

Yes, on the non-macOS caches all the local symbols are in the .symbols subcache.

mstange added a commit to mstange/object that referenced this issue Nov 25, 2021
Fixes gimli-rs#358.

This adds support for the dyld cache format that is used on macOS 12 and
iOS 15. The cache is split over multiple files, with a "root" cache
and one or more subcaches, for example:

```
/System/Library/dyld/dyld_shared_cache_x86_64
/System/Library/dyld/dyld_shared_cache_x86_64.1
/System/Library/dyld/dyld_shared_cache_x86_64.2
/System/Library/dyld/dyld_shared_cache_x86_64.3
```

Each file has a set of mappings. For each image in the cache, the
segments of that image can be distributed over multiple files: For
example, on macOS 12.0.1, the image for libsystem_malloc.dylib for the
arm64e architecture has its __TEXT segment in the root cache and the
__LINKEDIT segment in the .1 subcache - there's a single __LINKEDIT
segment which is shared between all images across both files. The
remaining libsystem_malloc.dylib segments are in the same file as the
__TEXT segment.

The DyldCache API now requires the data for all subcaches to be supplied
to the constructor.

The parse_at methods have been removed and replaced with a
parse_dyld_cache_image method.

With this patch, the following command outputs correct symbols for
libsystem_malloc.dylib:

```
cargo run --release --bin objdump -- /System/Library/dyld/dyld_shared_cache_arm64e /usr/lib/system/libsystem_malloc.dylib
```
mstange added a commit to mstange/object that referenced this issue Nov 25, 2021
Fixes gimli-rs#358.

This adds support for the dyld cache format that is used on macOS 12 and
iOS 15. The cache is split over multiple files, with a "root" cache
and one or more subcaches, for example:

```
/System/Library/dyld/dyld_shared_cache_x86_64
/System/Library/dyld/dyld_shared_cache_x86_64.1
/System/Library/dyld/dyld_shared_cache_x86_64.2
/System/Library/dyld/dyld_shared_cache_x86_64.3
```

Additionally, on iOS, there is a separate .symbols subcache, which
contains local symbols.

Each file has a set of mappings. For each image in the cache, the
segments of that image can be distributed over multiple files: For
example, on macOS 12.0.1, the image for libsystem_malloc.dylib for the
arm64e architecture has its __TEXT segment in the root cache and the
__LINKEDIT segment in the .1 subcache - there's a single __LINKEDIT
segment which is shared between all images across both files. The
remaining libsystem_malloc.dylib segments are in the same file as the
__TEXT segment.

The DyldCache API now requires the data for all subcaches to be supplied
to the constructor.

The parse_at methods have been removed and replaced with a
parse_dyld_cache_image method.

With this patch, the following command outputs correct symbols for
libsystem_malloc.dylib:

```
cargo run --release --bin objdump -- /System/Library/dyld/dyld_shared_cache_arm64e /usr/lib/system/libsystem_malloc.dylib
```

Support for local symbols is not implemented. But, as a first step,
DyldCache::parse requires the .symbols subcache to be supplied (if the
root cache expects one to be present) and checks that its UUID is correct.
MachOFile doesn't do anything with ilocalsym and nlocalsym yet, and we
don't yet have the struct definitions for dyld_cache_local_symbols_info
and dyld_cache_local_symbols_entry.
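
The check described in the last paragraph might look roughly like the sketch below. It is not the code from the commit: `SubcacheError` and `check_symbols_subcache` are hypothetical names, and modelling "the root cache expects a .symbols subcache" as an `Option` is an assumption made for illustration.

```rust
/// Ways in which the supplied .symbols subcache can fail validation.
#[derive(Debug, PartialEq, Eq)]
enum SubcacheError {
    /// The root cache expects a .symbols subcache but none was supplied.
    Missing,
    /// A subcache was supplied but its UUID differs from the expected one.
    UuidMismatch,
}

/// Validates the .symbols subcache against the root cache's expectation.
/// `expected` is the UUID recorded by the root cache (None if it expects no
/// .symbols subcache); `supplied` is the UUID read from the supplied subcache.
fn check_symbols_subcache(
    expected: Option<[u8; 16]>,
    supplied: Option<[u8; 16]>,
) -> Result<(), SubcacheError> {
    match (expected, supplied) {
        // Nothing is expected, so anything supplied is simply ignored.
        (None, _) => Ok(()),
        (Some(_), None) => Err(SubcacheError::Missing),
        (Some(want), Some(got)) if want == got => Ok(()),
        (Some(_), Some(_)) => Err(SubcacheError::UuidMismatch),
    }
}
```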
philipc pushed a commit that referenced this issue Nov 27, 2021
Fixes #358.
@mstange
Contributor Author

mstange commented Feb 8, 2022

The source for macOS Monterey dyld has been released now.

The new fields, which I called images_across_all_chunks_offset/_count above, are just called imagesOffset / imagesCount, with the old fields renamed to imagesOffsetOld / imagesCountOld:
https://github.com/apple-oss-distributions/dyld/blob/5c9192436bb195e7a8fe61f22a229ee3d30d8222/cache-builder/dyld_cache_format.h#L99-L100
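
One way a parser could choose between the old and new fields is sketched below. The byte offsets follow the struct posted earlier in this issue (old fields at 24 and 28, new fields at 448 and 452). Treating `mapping_offset` as the header size, reading little-endian, and the names `read_u32_le` and `image_table` are all assumptions for illustration, not how dyld or the object crate necessarily decide.

```rust
const MAPPING_OFFSET: usize = 16; // mappingOffset, taken here as the header size
const IMAGES_OFFSET_OLD: usize = 24; // imagesOffsetOld
const IMAGES_COUNT_OLD: usize = 28; // imagesCountOld
const IMAGES_OFFSET: usize = 448; // imagesOffset
const IMAGES_COUNT: usize = 452; // imagesCount

/// Reads a little-endian u32 at `offset`, or None if it is out of bounds.
fn read_u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let bytes = data.get(offset..offset.checked_add(4)?)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Some(u32::from_le_bytes(buf))
}

/// Returns (file offset, count) of the image list, preferring the new fields
/// when the header is large enough to contain them.
fn image_table(header: &[u8]) -> Option<(u32, u32)> {
    let header_size = read_u32_le(header, MAPPING_OFFSET)? as usize;
    if header_size >= IMAGES_COUNT + 4 {
        Some((
            read_u32_le(header, IMAGES_OFFSET)?,
            read_u32_le(header, IMAGES_COUNT)?,
        ))
    } else {
        Some((
            read_u32_le(header, IMAGES_OFFSET_OLD)?,
            read_u32_le(header, IMAGES_COUNT_OLD)?,
        ))
    }
}
```

A header from an older cache, whose mappings start before offset 456, falls through to the old fields, so the same function would cover both formats under these assumptions.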
