Skip to content

Config parser separation#5134

Merged
pks-t merged 6 commits intolibgit2:masterfrom
pks-t:pks/config-parser-separation
Jul 11, 2019
Merged

Config parser separation#5134
pks-t merged 6 commits intolibgit2:masterfrom
pks-t:pks/config-parser-separation

Conversation

@pks-t
Copy link
Member

@pks-t pks-t commented Jun 21, 2019

This is the missing piece to really separate the configuration parser and the configuration file backend. Goal is to make the code easier to understand as well as to enable easier refactoring. But I guess the result speaks for itself.

Note: this PR includes the two commits from #5132. I wanted to avoid having to fix conflicts, as I expect that #5132 will land quite fast.

@pks-t pks-t force-pushed the pks/config-parser-separation branch from cbd401e to e8852a5 Compare June 24, 2019 16:14
@pks-t
Copy link
Member Author

pks-t commented Jun 24, 2019

Disentangled this from #5132

@pks-t
Copy link
Member Author

pks-t commented Jun 27, 2019

Any comments? I'd like to fast track this as it will cause conflicts with the other config updates I have in flight

Copy link
Contributor

@tiennou tiennou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, minor nitpicks only, though I'm somewhat interested about the includes moving down one level. Isn't that something that somehow "in-memory" would want ?

git_config_parser_comment_cb on_comment,
git_config_parser_eof_cb on_eof,
void *data)
void *payload)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


typedef struct {
git_config_file *file;
const char *configname;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file_path ? I think configname dilutes the aspect that it's just a filesystem "token" or something…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explicitly wanted to avoid this. Backends may not use files at all, most importantly the memory backend doesn't care for filenames. I thus wanted to have this be generic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our "outer layer" doesn't really know about anything other than files, and will generally needs a "label" that will usually looks like a file path. It's just about labelling that data, be it file-backed or in-memory. file_identifier, config_label, or something else ?

Really, it's because configname sounds like a "configuration preset name", not like "the user-visible name of the configuring thing".

Feel free to keep it as is though, just wanted to see if some bikeshedding would make it read nicer 😉.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Would you be happy with config_identifier?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config_identifier it is then. Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it some more. Maybe we should just leave this at file or file_path. At least it seems like an obvious route to go at a later point when implementing in-memory includes.

@pks-t
Copy link
Member Author

pks-t commented Jul 11, 2019

LGTM, minor nitpicks only, though I'm somewhat interested about the includes moving down one level. Isn't that something that somehow "in-memory" would want ?

Do you mean to say that there should be "config_file.h" and "config_mem.h" headers?

@tiennou
Copy link
Contributor

tiennou commented Jul 11, 2019

My "includes moving down one level" was about this: I seem to understand that there would be no way now (if there had ever been) to build a complete in-memory configuration comprised of multiple includes, because that include mechanism was just moved down to the file-backend only part, and/or was never extended to refer to non-file-backed includes ?

@pks-t
Copy link
Member Author

pks-t commented Jul 11, 2019

My "includes moving down one level" was about this: I seem to understand that there would be no way now (if there had ever been) to build a complete in-memory configuration comprised of multiple includes, because that include mechanism was just moved down to the file-backend only part, and/or was never extended to refer to non-file-backed includes ?

Oh, that's what you mean by includes! Thought you referred to C header includes.

Yeah, the memory backend is completely unable to handle this. The problem is that in order to compute includes, you need to compute paths relative to the parsed file's location. As the memory-stuff doesn't have any associated path per se, I simply made them not honor includes at all. That being said, I'm not opposed at all to that at a later point.

@tiennou
Copy link
Contributor

tiennou commented Jul 11, 2019

Thanks for the clarification 👍. I had a hunch it wasn't really supported.

@pks-t pks-t force-pushed the pks/config-parser-separation branch from e8852a5 to 3001018 Compare July 11, 2019 08:57
@pks-t
Copy link
Member Author

pks-t commented Jul 11, 2019

Rebased to fix conflicts with #5132, renamed configname to path

pks-t added 6 commits July 11, 2019 10:58
By convention, parameters that get passed to callbacks are
usually named `payload` in our codebase. Rename the `data`
parameters in the configuration parser callbacks to `payload` to
avoid confusion.
The config file code needs to keep track of the actual
`git_config_file` structure, as it not only contains the path
of the current configuration file, but it also keeps tracks of
all includes of that file. Right now, we keep track of that
structure via the `git_config_parser`, but as that's supposed to
be a backend generic implementation of configuration parsing it's
a layering violation to have it in there.

Switch over the config file backend to use its own config file
structure that's embedded in the backend parse data. This allows
us to switch over the generic config parser to avoid using the
`git_config_file` structure.
The config parser code needs to keep track of the current parsed
file's name so that we are able to provide proper error messages
to the user. Right now, we do that by storing a `git_config_file`
in the parser structure, but as that is a specific backend and
the parser aims to be generic, it is a layering violation.

Switch over to use a simple string to fix that.
With the previous commits, we have finally separated the config
parsing logic from the specific configuration file backend. Due
to that, we can now move the `git_config_file` structure into the
config file backend's implementation so that no other code may
accidentally start using it again. Furthermore, we rename the
structure to `diskfile` to make it obvious that it is internal,
only, and to unify it with naming scheme of the other diskfile
structures.
Error handling in `config_write` is rather convoluted and does
not match our current code style. Refactor it to make it easier
to understand.
Right now, all configuration file backends are expected to
directly mess with the configuration parser's internals in order
to set it up. Let's avoid doing that by implementing both a
`git_config_parser_init` and `git_config_parser_dispose` function
to clearly define the interface between configuration backends
and the parser.

Ideally, we would make the `git_config_parser` structure
definition private to its implementation. But as that would
require an additional memory allocation that was not required
before we just live with it being visible to others.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants