[th/nft-mem] reorder code so we can release temporary object earlier during nft cmd by thom311 · Pull Request #1303 · firewalld/firewalld

thom311 · 2024-02-09T14:59:41Z

During nft command, we may (temporarily) create a lot of data to track the rules. This increases the memory usage, even if only for a short time.

Rework some code, so that we can drop references earlier when we no longer need them.

thom311 · 2024-02-12T18:52:10Z

previously, there was a patch here which dropped the chunking from nftables.set_restore(). That doesn't work (yet?). Patch was dropped (for now?).

thom311 · 2024-02-13T14:28:18Z

previously, there was a patch here which dropped the chunking from nftables.set_restore(). That doesn't work (yet?). Patch was dropped (for now?).

With patch improvement(nftables): combine request for entries of ipset, it works. Patch re-added.

erig0 · 2024-03-12T17:50:23Z

    return True


+def iter_reiterable(iterable):


Why have this function vs just calling list(thing) where you need it?

Yes. It makes a copy where you may avoid it in some cases, but this function is just obfuscating things.

Most callers probably are just lists, so re-iterating them is fine. It would be wasteful to clone the list for that. But more importantly, on the next review, somebody thinks "oh, this clone is unneecessary". Then I would have to add a code comment explaining why the clone is even there. At this point, I wouldn't clone the list.

But that makes the function less general and the caller needs to make sure not to mess up (in non-obvious ways). The function no longer accepts arbitrary iterables, only iterables that can be iterated more than once. So, maybe I should add a code comment there that indicates that this function can only be called with certain iterables. But all callers currently are lists (probably?), so I wouldn't want to add such a code comment either.

iter_reiterable() solves this. More than anything, it clearly expresses the intent of making the implementation safe. The caller no longer need to worry. The reviewer of the code also sees clearly why this is done, and can reason why this is the correct thing to do.

It doesn't matter whether all callers today(!!) pass a list(). Using iter_reiterable() in a function makes the function more accepting to input and less of a foot gun.

erig0 · 2024-03-12T18:05:55Z

+def removeFile(filename):
+    try:
+        os.remove(filename)
+    except OSError:


Remove (delete) the file path. If path is a directory, an OSError is raised.

Seems like we want propagate the error in this case.

https://docs.python.org/3/library/os.html

I think not. At least, not for the caller at hand.

The caller is nft_cmd(), and that one really does not want to be bothered with this. It certainly shouldn' t just let the exception escape, so, it would need to handle it. But how? Logging a message?

If the caller still needs to handle OSError (e.g. because the path is a file, or because the file has wrong permissions), then they should handle FileNotFoundError themselves. This function is not for them. This function is for "I don't care about OSError and don't want to write 4 extra lines of code for that" (and that's the only purpose).

If you think that nft_cmd() needs to handle IsADirectoryError (e.g. by logging a warning), then that's another thing.

erig0 · 2024-03-12T18:13:21Z

-        self.nftables = Nftables()
-        self.nftables.set_echo_output(True)
-        self.nftables.set_handle_output(True)
+        self._nft_handle = Nftables()


handle is confusing with rule handles. Maybe _nft_ctx is a better name? Or _nftables_obj.

I prefer to drop this commit. What are you grepping for? In this file "nftables" shows up more as a string than as an object/identifier.

I search for self._nft_handle (or whatever it's called).

It's in general good to name different things differently. Extra bonus points for unique names.

Rename to "_nft_ctx".

erig0 · 2024-03-12T19:06:49Z

+        if isinstance(json_root, list):
+            # The caller gave us the JSON in a list, so we can drop the
+            # reference to the memory. Unpack the list, and delete the entry.
+            l = json_root
+            (json_root,) = l
+            del l[0]


I don't see any cases where json_root would be a list. It's always a dict.

This code also doesn't work. If json_root is a list then you can't assign l to a tuple.

>>> foo=[1,2,3,4,5] >>> (blah,) = foo Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: too many values to unpack (expected 1)

Or json_root is a list of one element, then this does nothing.

I think it works as intended. (blah,) = foo is tuple unpacking, and unpacks an interable of one element. It checks that the number of element matches, which is the reason for doing this instead of blah = foo[0].

Or json_root is a list of one element, then this does nothing.

If the caller passes json_root as a list, then it MUST be a list of one element (otherwise, the tuple unpacking fails). But then it does the right thing. Unpack the list, and clear the element from the list.

I don't see any cases where json_root would be a list. It's always a dict.

yes, it was previously unused. I now added a two commits

improvement(nftables): avoid creating JSON string twice before nft_cmd()

improvement(nftables): free json data earlier during nft_cmd()

erig0 · 2024-03-12T19:20:05Z

+            (json_root,) = l
+            del l[0]
        json_root_str = json.dumps(json_root)
+        del json_root


I'm not sure this does anything if other areas of the code, e.g. the caller, are holding a reference. It would have to be json_root.clear() to remove the dict entries.

it would, if json_root argument were a list.

I added commit "improvement(nftables): free json data earlier during nft_cmd()" to do that.

erig0 · 2024-03-12T19:27:44Z

-        json_root_str = json_root_str.encode("utf-8")
-        rc, output, error = self._nft_handle.cmd(json_root_str)
+
+        if is_large and hasattr(self._nft_handle, "cmd_from_file"):


This hasattr() guard is weird. AFAIK, python-nftables has always had this function.

it's only "recently" added to the python library, here: https://git.netfilter.org/nftables/commit/?id=448bd104d3530ed2870921a35ebf54589cdb274c

This is recent enough, that github CI tests would break without this.

erig0 · 2024-03-12T19:29:26Z

+        else:
+            json_root_str = json.dumps(json_root)
+            del json_root
+            json_root_str = json_root_str.encode("utf-8")
+            rc, output, error = self._nft_handle.cmd(json_root_str)


Instead of adding complexity by sometimes doing it directly and sometimes doing it via a file, why not just always do it from a file?

I think it will be slower to write a small file. The branch change should not make some cases slower.

It was more about consistency and reducing complexity. One code path for all operations.

erig0 · 2024-03-12T19:35:23Z

This function is unused by the end of this changeset. Please drop this commit.

erig0 · 2024-03-12T19:36:51Z

There are a lot of changes here that I'm not convinced are providing real benefit. The ipset entry changes are for sure. But some of the explicit dels I am not convinced about.

I'd like to see the ipset entry change in a separate PR.
I'd like to see the cmd_from_file() change also in a separate PR.

Do you have any benchmarks about the improvements from del and clear() commands?

- extract a tempDir() function. - make the function configurable. In particular, all files created with the same prefix is bad, because if we have a left over file on disk, it's hard to tell where the file comes from.

It swallows IO exceptions. We often want to just get rid of a file, and don't bother with a failure to do so.

Most of the time, when the caller constructs a list of rules and calls set_rules(), the don'tt need the rules anymore. We could release them from memory earlier, right after set_rules() processed the list further. Add an argument "rules_clear" (default to False) to set_rules(). If true, set_rules() will clear the list after it no longer needs it. That may allow some memory to be reclaimed early.

…sing rules early These callers don't actually care about the list that they pass to set_rules(). By setting rules_clear=True, set_rules() will clear the list after using it, and thereby potentially already free some objects earlier, that we don't need anymore. The point is to reduce spikes in memory, by more eagerly release data after we no longer need it.

From libnftables we import nftables.nftables.Nftables. The "nftables.py" module brings a class nftables.nftables. Don't also call the Nftable attribute self.nftables. Instead, rename to "self._nft_ctx". The main point is using a distinct name, so we can grep for it.

We will rework the invocation of the JSON command, by writing the JSON to file. For that, move the call to self._nft_ctx.json_cmd() to a new helper. Note that libnftables' json_cmd() is itself just a small wrapper around json.dumps() and Nftables.cmd(). As we want to get close to a place where we don't generate a JSON string in memory, take the json_cmd() implementation and re-implement it. Later we will modify it further.

This allows to first `del json_blob` and free the data (if any). Before we start allocating new data while parsing the JSON response.

Instead of passing the JSON to libnftables, write the JSON to a file in /run/firewalld, and use cmd_from_file() to load it from there. The ultimate goal is to reduce the amount of memory we keep around. Note that the test don't create a separate mount namespace and don't remount /run. They maybe should do that. But with this change, we have yet another place where we write temporary files to /run. And if we run tests in parallel, they could theoretically interfere. In practice they don't, because the temporary files have unique names. In any case, if this is a problem, the solution is that tests mount a private /run/firewalld.

When we configure real rules, we request the handles from the nft call. Otherwise, that data is not used. Only request echo output, when we actually use the information.

Previously, when creating an ipset, we would create for each entry a separate "add" rule, like { "add": { "element": { "family": "inet", "table": "firewalld", "name": "foobar", "elem": [ { "prefix": { "addr": "10.10.1.1", "len": 32 } } ] } } }, { "add": { "element": { "family": "inet", "table": "firewalld", "name": "foobar", "elem": [ { "prefix": { "addr": "10.10.1.2", "len": 32 } } ] } } }, ... This is much slowing in libnftables (or kernel) to process, then having all entries at once. Combine them, now we have: { "add": { "element": { "family": "inet", "table": "firewalld", "name": "foobar", "elem": [ { "prefix": { "addr": "10.10.1.1", "len": 32 } }, { "prefix": { "addr": "10.10.1.2", "len": 32 } }, ...

The problem with chunking is that it splits the transaction, and what should be an atomic update, is not. libnftables is supposed to provide us with a concept of transactions, if we cannot use that because the transaction is too large, then that's a fail. We changed the format of the JSON for ipset entries, instead of an "add" command for each entry, they are now all combed in one. This significantly makes the operation faster, and we shouldn't need the chunking anymore. This in parts reverts 9a86f11 ('fix(ipset): chunk entries when restoring set'), however, in the meantime it is thought that the chunking is no longer necessary.

The JSON data may be large, and we have to convert it to string. Optimally, we free it as soon as we no longer need it. Previously, if nft_cmd() failed, we would raise a ValueError() with the full JSON blob as error message. That means, we had to keep the JSON blob around for longer. It's questionable whether we really need this. Note that if you run with debug logging, then we log the full JSON before making the call. If we don't have debug logging enabled, then we shouldn't embed the full JSON data in our error message. By not embedding the JSON string in the ValueError() after failure, we can get rid of it earlier.

If we already convert the JSON to string for logging purposes, don't do it a second time for nft_cmd(). Instead, reuse the string.

thom311 · 2024-03-14T12:40:13Z

Do you have any benchmarks about the improvements from del and clear() commands?

No. You brought up how the ipset transaction has to be split due to memory (peak) usage. The deletes and clears allow it that memory that is no longer needed gets released earlier. I think they are successful at that, because (from review) it is clear that the variables are not aliased (have no other references). Also, CPython uses reference counting (and a GC for reference loops), so once those (loopless) variables get freed right away. Does it measurably help performance? That is hard to say.

Equally important, does it hurt readability/maintainability? I don't think so. A del reduces the scope of a variable (which you otherwise cannot do in Python). Reducing the scope is IMO on its own a good thing (for maintainability), because it becomes clear that a variable shall not be used after that point. For example, nft_cmd() function is 41 lines, easy to understand and maintain. Dropping the dels there, doesn't make it significantly easier.

Too bad, that flake8 (currently) cannot catch using a variable after it was deleted. However, mypy can (which we should enable).

thom311 force-pushed the th/nft-mem branch 3 times, most recently from a34b3af to 2a86f73 Compare February 12, 2024 18:51

thom311 force-pushed the th/nft-mem branch 3 times, most recently from f61d03c to 4658174 Compare February 13, 2024 14:27

thom311 requested a review from erig0 February 13, 2024 18:52

thom311 force-pushed the th/nft-mem branch 2 times, most recently from c207bc9 to ea7be70 Compare February 14, 2024 09:38

erig0 requested changes Mar 12, 2024

View reviewed changes

thom311 added 15 commits March 14, 2024 12:03

improvement(functions): add iter_split_every() helper

8fbf7be

improvement(functions): add iter_reiterable() helper

89e6e47

improvement(functions): improve firewall.functions.tempFile() helper

a5e4e0a

- extract a tempDir() function. - make the function configurable. In particular, all files created with the same prefix is bad, because if we have a left over file on disk, it's hard to tell where the file comes from.

improvement(functions): add firewall.functions.removeFile() helper

af8a23f

It swallows IO exceptions. We often want to just get rid of a file, and don't bother with a failure to do so.

improvement(nftables): parse nft output later to save memory

8985383

This allows to first `del json_blob` and free the data (if any). Before we start allocating new data while parsing the JSON response.

improvement(nftables): avoid nft output when it's not needed

0cb599a

When we configure real rules, we request the handles from the nft call. Otherwise, that data is not used. Only request echo output, when we actually use the information.

improvement(nftables): avoid creating JSON string twice before nft_cmd()

0035997

If we already convert the JSON to string for logging purposes, don't do it a second time for nft_cmd(). Instead, reuse the string.

thom311 force-pushed the th/nft-mem branch from ea7be70 to 0035997 Compare March 14, 2024 11:26

Conversation

thom311 commented Feb 9, 2024

Uh oh!

thom311 commented Feb 12, 2024

Uh oh!

thom311 commented Feb 13, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

erig0 commented Mar 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thom311 commented Mar 14, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

erig0 commented Mar 12, 2024 •

edited

Loading