Skip to content

Add PEM command line flag for enforcing a max file size for ElfReader#2194

Merged
ddelnano merged 2 commits intopixie-io:mainfrom
ddelnano:ddelnano/add-file-size-cap-to-elf-reader
May 14, 2025
Merged

Add PEM command line flag for enforcing a max file size for ElfReader#2194
ddelnano merged 2 commits intopixie-io:mainfrom
ddelnano:ddelnano/add-file-size-cap-to-elf-reader

Conversation

@ddelnano
Copy link
Copy Markdown
Member

@ddelnano ddelnano commented May 6, 2025

Summary: Add PEM command line flag for enforcing a max file size for ElfReader

While memory profiling the PEM, I noticed that large binaries caused the PEM's ElfReader code to make > 100 MiB allocations. This PR adds a cli flag (--elf_reader_max_file_size) that allows memory conscious users to prevent the PEM from parsing binaries that would cause these large memory allocations.

Relevant Issues: N/A

Type of change: /kind feature

Test Plan: Existing tests and skaffold

Changelog Message: Added --elf_reader_max_file_size flag to PEM. This flag allows memory conscious users to prevent the PEM from parsing large binaries in order to avoid large memory allocations.

@ddelnano ddelnano requested a review from a team as a code owner May 6, 2025 05:37
Comment on lines +55 to +62
const std::string& binary_path, const std::filesystem::path& debug_file_dir = kDebugFileDir);

/**
* Creates an ElfReader that does not enforce the max file size limit. This is useful for cases
* where the binary size is known in advance or the binary must be loaded regardless of size.
*/
static StatusOr<std::unique_ptr<ElfReader>> CreateUncapped(
const std::string& binary_path, const std::filesystem::path& debug_file_dir = kDebugFileDir);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an API perspective, it might be simpler to have a single Create() where the size is an optional parameter that defaults to FLAGS_elf_reader_max_file_size. What do you think?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to changing this, but I wanted to make the uncapped behavior an explicit opt-in for consumers of ElfReader. Since it's used in many places, I could see a future bug where the max file size check is unintentionally bypassed when most consumers should retain the restriction.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about this offline, but the suggestion is that the optional parameter defaults to the capped version, so I don't think it exposes any risk. The goal is just to keep the API simpler. But this is a bit subjective, so doesn't need to be a blocker if you prefer the two separate API calls.

Comment on lines +202 to +211
if (FLAGS_elf_reader_max_file_size != 0) {
PX_ASSIGN_OR_RETURN(const auto stat, fs::Stat(binary_path));
int64_t file_size = stat.st_size;
if (file_size > FLAGS_elf_reader_max_file_size) {
return error::Internal(
"File size $0 exceeds ElfReader's max file size $1. Refusing to process file", file_size,
FLAGS_elf_reader_max_file_size);
} else if (file_size < 0) {
return error::Internal("stat() returned negative file size $0 for file $1", file_size,
binary_path);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put much of this in a helper function StatusOr GetFileSize(), which can return an error in the case of file_size < 0. Then you can do:

 PX_ASSIGN_OR_RETURN(int64_t file_size, GetFileSize(binary_path));
 if (file_size > FLAGS_elf_reader_max_file_size) {
    return error::Internal(
          "File size $0 exceeds ElfReader's max file size $1. Refusing to process file", file_size,
          FLAGS_elf_reader_max_file_size);
  }

which is a bit more focused.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've refactored this and relocated to the fs_wrapper code in 9d48a38.

@ddelnano ddelnano requested a review from a team as a code owner May 12, 2025 05:18
Comment on lines +55 to +62
const std::string& binary_path, const std::filesystem::path& debug_file_dir = kDebugFileDir);

/**
* Creates an ElfReader that does not enforce the max file size limit. This is useful for cases
* where the binary size is known in advance or the binary must be loaded regardless of size.
*/
static StatusOr<std::unique_ptr<ElfReader>> CreateUncapped(
const std::string& binary_path, const std::filesystem::path& debug_file_dir = kDebugFileDir);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about this offline, but the suggestion is that the optional parameter defaults to the capped version, so I don't think it exposes any risk. The goal is just to keep the API simpler. But this is a bit subjective, so doesn't need to be a blocker if you prefer the two separate API calls.

@ddelnano ddelnano merged commit 54f8675 into pixie-io:main May 14, 2025
40 of 42 checks passed
@ddelnano ddelnano deleted the ddelnano/add-file-size-cap-to-elf-reader branch May 14, 2025 04:12
ddelnano added a commit to k8sstormcenter/pixie that referenced this pull request Feb 25, 2026
…pixie-io#2194)

Summary: Add PEM command line flag for enforcing a max file size for
ElfReader

While memory profiling the PEM, I noticed that large binaries caused the
PEM's ElfReader code to make > 100 MiB allocations. This PR adds a cli
flag (`--elf_reader_max_file_size`) that allows memory conscious users
to prevent the PEM from parsing binaries that would cause these large
memory allocations.

Relevant Issues: N/A

Type of change: /kind feature

Test Plan: Existing tests and skaffold

Changelog Message: Added `--elf_reader_max_file_size` flag to PEM. This
flag allows memory conscious users to prevent the PEM from parsing large
binaries in order to avoid large memory allocations.

---------

Signed-off-by: Dom Del Nano <[email protected]>
ddelnano added a commit to k8sstormcenter/pixie that referenced this pull request Feb 25, 2026
…pixie-io#2194)

Summary: Add PEM command line flag for enforcing a max file size for
ElfReader

While memory profiling the PEM, I noticed that large binaries caused the
PEM's ElfReader code to make > 100 MiB allocations. This PR adds a cli
flag (`--elf_reader_max_file_size`) that allows memory conscious users
to prevent the PEM from parsing binaries that would cause these large
memory allocations.

Relevant Issues: N/A

Type of change: /kind feature

Test Plan: Existing tests and skaffold

Changelog Message: Added `--elf_reader_max_file_size` flag to PEM. This
flag allows memory conscious users to prevent the PEM from parsing large
binaries in order to avoid large memory allocations.

---------

Signed-off-by: Dom Del Nano <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants