pfelf: add (Section||Prog).MmapData() #450

florianl · 2025-04-24T11:54:05Z

Add MmapData() that returns mmap backed Data of the respective Prog or Section.

Possible option as described in #352 (comment) to mmap .gopclntab.

Add MmapData() that returns mmap backed Data of the respective Prog or Section. Possible option as described in #352 (comment) to mmap `.gopclntab`. Signed-off-by: Florian Lehner <florian.lehner@elastic.co>

fabled · 2025-04-26T12:05:44Z

Add MmapData() that returns mmap backed Data of the respective Prog or Section.

Possible option as described in #352 (comment) to mmap .gopclntab.

So designing the API for mmapping is the critical portion of the problem. And I would prefer not to special case it to .gopclntab handling. I believe mmaping any shared object (also PE for dotnet unwinder) should be favored if:

the file is large
we know that there will be several reads done on the file
mmap can be done

In the above case it would probably make sense to mmap the whole file in one go, and have all of pfelf and its existing methods return slices to the mmapped memory. This would significantly reduce memory overhead on handling any large ELF file. Also the pfelf internally should use mmapped data for symbol lookups etc, if possible. IIRC, just converting the current virtual memory read APIs to read from mmap area should be sufficient for this.

In otherwords, the caller of pfelf should not care if mmap or read is being used to access the file. That should be internal implementation detail. And the high level API should work identically regardless of which method of access is used.

Or does anyone see problems with the above approach?

I believe the main issues to sort would also be:

using lot of address space (but should be still insignificant amount on 64-bit architectures, and I think we are not looking to support 32-bits?)
not sure if there is usage that the pfelf returned byte arrays are kept beyond calling Close on the pfelf? (this should not be a problem if using exp/mmap which uses finalizer to unmap)
do we need to care if there's two separate pfelf instances mmapping the same file?

Also unix.Mmap internally requires locking and registering mappings in a go map. I wonder if we could instead use exp/mmap (https://pkg.go.dev/golang.org/x/exp/mmap) which does not require this and seems to provide a bit more useful API in the first place.

florianl · 2025-05-02T07:13:50Z

Thanks for the feedback 🙏

I agree, that the API of pfelf should not change and if the file is mmaped or not should be an internal implementation detail. Also switching to exp/mmap sounds reasonable to me and also expanding the approach so that other use cases can benefit from it as well. The idea with this first approach was, to figure out the constraints, limits and potential issues with mmap.

pfelf: add (*Section||*Prog).MmapData()

c5c0f20

Add MmapData() that returns mmap backed Data of the respective Prog or Section. Possible option as described in #352 (comment) to mmap `.gopclntab`. Signed-off-by: Florian Lehner <florian.lehner@elastic.co>

florianl force-pushed the issue-352 branch from 25a690f to c5c0f20 Compare April 24, 2025 12:04

florianl closed this May 2, 2025

florianl mentioned this pull request May 6, 2025

pfelf: use mmap to access Prog() and Data() #455

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pfelf: add (Section||Prog).MmapData() #450

pfelf: add (Section||Prog).MmapData() #450

florianl commented Apr 24, 2025

fabled commented Apr 26, 2025 •

edited

Loading

florianl commented May 2, 2025

pfelf: add (*Section||*Prog).MmapData() #450

pfelf: add (*Section||*Prog).MmapData() #450

Conversation

florianl commented Apr 24, 2025

fabled commented Apr 26, 2025 • edited Loading

florianl commented May 2, 2025

pfelf: add (Section||Prog).MmapData() #450

pfelf: add (Section||Prog).MmapData() #450

fabled commented Apr 26, 2025 •

edited

Loading