Add definitions of macros to diffkemp-out.yaml#347
Add definitions of macros to diffkemp-out.yaml#347PLukas2018 wants to merge 2 commits intodiffkemp:masterfrom
diffkemp-out.yaml#347Conversation
| self.function_names = set() | ||
| self.macro_names = set() | ||
| # Note: type_names contains tuples (type name, parent name) | ||
| # parent name - name of function in which type is used |
There was a problem hiding this comment.
Relevant comment from #314:
Could be a macro, right?
Suggested change
# parent name - name of function in which type is used # parent name - name of function/macro in which type is used
I think, that currently, we are reporting type differences only from functions. Even if the type was originally used in a macro, the call stacks report type differences from the function in which the macro with the differing type was used. So the parent name (meaning the name of the previous call from the call stack) for type/struct should always be a function.
FrNecas
left a comment
There was a problem hiding this comment.
Looks good as far as I can tell, just a minor nitpick (obviously I may be missing some context of the previous PRs related to macros)
diffkemp/output.py
Outdated
| if called_res.first.callstack: | ||
| macro_defs, diff_macro_info = called_res.first.callstack \ | ||
| .get_macro_defs(compared_fun) | ||
| self._add_definitions(macro_defs, "old") |
There was a problem hiding this comment.
Hm, I am not a big fan of the "old", "new", "kind" string literal hard-coding. It would be nice if we could use a string enum for this instead but I am not sure which python versions diffkemp aims to support (we are still supporting something old, aren't we?). If that is the case, feel free to ignore this.
There was a problem hiding this comment.
Thanks for the review. I replaced the old/new with boolean parameter is_old. For the kind I used enum which inherits from str, Enum (this should be available from Python 3.4), I could use StrEnum but it is supported from Python 3.11 and as you said I am also not sure which Python versions we want to support.
Add a new class (`MacroDefinitions`) for extracting definitions of macros so they can be saved into `diffkemp-out.yaml` file. When processing call stacks definitions of macros (all except the differing macro) are extracted from them. Also, the name of the differing macro is extracted from the call stacks and the name of the function/vertex (which should contain detailed information about the macro). Using this information the definition of differing macro is then extracted from the `SyntaxDiff` object saved in vertex of ComparisonGraph. `_create_def_info` method of the `YamlOutput` class was changed to function so it can also be used from the new class (`MacroDefinitions`). Unit tests for the `YamlOutput` class are updated with testing/checking definitions of macros. Because the test fixture in `conftest.py` file does not contain an absolute path to files for symbols it was necessary to mock some `os.path` functions. Also by adding the definitions to the `diffkemp-out.yaml` it is now possible to see the code of the macros in the result viewer.
Replacing hard-coded strings for representing kind of symbols with string enum. Note: In case we will be supporting pyhon 3.11+ we could use `StrEnum` instead of `str, Enum`.
d34abbb to
e45732f
Compare
viktormalik
left a comment
There was a problem hiding this comment.
TBH, I'm quite confused by the new code and it seems over-complicated to me at this point. I'm not saying that it's not the best way, it's just that I'm failing to see why that's the case.
| self.new_dir = snapshot_dir_new | ||
| self.result = result | ||
| # Information from call stacks for extracting defs for differing macros | ||
| # - list of tuples (vertex name, differing/last macro name). |
There was a problem hiding this comment.
Two remarks:
- Is this a list or a set?
- What does the vertex name correspond to? Maybe it's worth specifying.
| self.function_names = set() | ||
| self.macro_names = set() | ||
| # Note: type_names contains tuples (type name, parent name) | ||
| # parent name - name of function in which type is used |
There was a problem hiding this comment.
Maybe we could call it caller name or caller function name? We probably wouldn't even need the additional explanation comment, then.
| if called_res.second.callstack else [] | ||
| # updating symbol names | ||
| if called_res.first.callstack: | ||
| # Note: Macro names are not currently used. |
There was a problem hiding this comment.
Should we drop the field, then? Also an explanation of why the macros are handled differently from functions and types would be very helpful.
There was a problem hiding this comment.
Also an explanation of why the macros are handled differently from functions and types would be very helpful.
- So for functions, the location of the definition (file and line where the definition is located) is saved directly in
Vertexclass/object (class representing cached results for a function) inComparisonGraph. - For types the location of the definition is cached directly in
TypeDiffclass/object which is accessible fromVertexclass/object (from the function which contains/uses the differing type). - In the case of the macros, the location of definitions is not cached directly but it is possible to extract them from macros
callstackinSyntaxDiffclass which is accessible fromVertexclass (function which uses macros with differences) or to extract them from the final createdcallstack(which I currently do). For the differing macro is the location of the definition not possible to extract fromcallstackbut it is cached inSyntaxDiff(PR Caching information about definitions of differing macros #333).
| return self.value | ||
|
|
||
|
|
||
| class MacroDefinitions: |
There was a problem hiding this comment.
Why is macro collection done using a class while function and types are handled directly by methods of YamlOutput. Are macro somewhat special that the collection must keep some state?
There was a problem hiding this comment.
Why is macro collection done using a class while function and types are handled directly by methods of
YamlOutput. Are macro somewhat special that the collection must keep some state?
The macro collection is done in a separate class mostly because the extraction of the definition is a little bit more complex so I created a class for it so the logic is encapsulated, also it was done in reaction to this comment on the original PR #314 (comment)
| result).output | ||
| # Mocking path.join for the same reason as relpath. | ||
| # Note: This mock can be problem if diffkemp uses >2 paths in join. | ||
| def mock_path_join(path1, *path2): |
There was a problem hiding this comment.
Mocking functions from os.path sounds very dangerous. Are we not able to come up with a better solution?
Yeah, it is probably over-complicated and probably also not very effective. There are probably better solutions, I did it like this to leave most of the logic the same and just at the end extract the necessary information (location of definitions) from places which contains them. Another ideas of how to do this which comes to my mind:
Not sure if all the ideas are implementable, just some options that came to my mind. I will need to more deeply think about this because I am not sure which approach to choose. |
|
Yeah, it seems that there are many ways to do this. From the first glance, option 3 seems like the best but I think that we should have an in-person discussion on this to better understand what the problem is and what the best solution would be. |
This PR contains the last but one part of #314 (remains enhancements of viewer for visualisation of code/diff for
macro-function/function-macrodifferences.The PR adds extracted information about the location of macro definitions to the
diffkemp-out.yamlfile which is created whendiffkemp compareis run. Thanks to the added macro definitions (and thanks to previously merged PRs) it will be now possible to see the code of the macros in the result viewer.