Native Julia and Modelica parser services for Wendao, exposed over Arrow Flight.
This package owns:
- Julia parsing through `JuliaSyntax.jl`
- Modelica parsing through `OMParser.jl`
- AST-query-level search under `src/search/`
- normalization into a stable native Arrow Wendao parser contract
- Arrow Flight request and response helpers built on `WendaoArrow`
Package boundary:
- `WendaoCodeParser.jl` is parser-side only: it owns native parsing, normalization, and parser-route service behavior
- it does not own Rust-side or host-side Flight client linkage
- client linkage belongs in the Rust search, graph, and runtime integration layers that consume these parser routes
The initial slice keeps the Rust client cutover out of scope and proves the provider contract first.
WendaoSearch.jl can also mount these parser routes into its existing live gRPC
service with --code-parser-route-names, so the same Arrow Flight process can
serve both graph-search routes and AST-query routes during local loopback tests.
Package docs now also live under `docs/`.
Current backend status:
- Julia summary and AST-query routes are implemented with `JuliaSyntax.jl`
- Modelica summary and AST-query routes are implemented with `OMParser.jl`
- When the upstream `Pkg.build("OMParser")` path is broken on macOS Julia 1.12, this package falls back to the official upstream source-build flow `lib/parser -> autoconf -> ./configure -> make`
- The current workspace lock pins `OMParser.jl` to `https://github.com/tao3k/OMParser.jl` at `cebc0696407385e52496608fcc13e95a556da3b5` until the bootstrap fixes are consumed upstream
- The current workspace lock pins `WendaoArrow.jl` to `https://github.com/tao3k/WendaoArrow.jl.git` at `334615136a8b68f18eedc614e0cc5ad33494ecc8` instead of a local sibling path, so package resolution and GitHub Actions use the same Arrow transport revision
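The source-build fallback named above can be pictured as an ordered command plan. This is only a sketch: it assumes the steps run inside `lib/parser` of a local `OMParser.jl` checkout, and `source_build_plan`, `run_plan`, and the `deps/OMParser.jl` path are hypothetical names, not part of this package.

```julia
# Assumption: `checkout` is the root of a local OMParser.jl source tree and the
# autoconf/configure/make steps all run inside its lib/parser directory.
function source_build_plan(checkout::AbstractString)
    parser_dir = joinpath(checkout, "lib", "parser")
    steps = (`autoconf`, `./configure`, `make`)
    # Attach the working directory to each command without running anything.
    return [Cmd(step; dir=parser_dir) for step in steps]
end

# Run the plan in order; with dry_run=true the commands are only printed.
function run_plan(plan; dry_run::Bool=true)
    for cmd in plan
        if dry_run
            println("would run ", cmd.exec, " in ", cmd.dir)
        else
            run(cmd)  # throws if a step fails, so later steps do not run
        end
    end
    return length(plan)
end

plan = source_build_plan(joinpath("deps", "OMParser.jl"))
run_plan(plan)
```

Keeping the plan as data makes it easy to log or inspect the fallback before anything touches the native toolchain.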
Native bridge note:
- `OMParser.jl` still uses a native parse bridge that resolves `Absyn`, `ImmutableList`, and `MetaModelica` from `Main` during parser initialization
- `WendaoCodeParser.jl` therefore aliases those already-loaded modules into `Main` before the first Modelica parse, especially for mounted live-child startup under `WendaoSearch.jl`
- This runtime requirement is separate from the upstream `OMParser.jl` build/bootstrap lane: the upstream PR still matters for `Pkg.build(...)`, release assets, and CI coverage, but it does not by itself close the live child startup contract
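As a rough illustration of the aliasing step, the sketch below binds an already-loaded module into `Main` under the name the bridge expects. It assumes a plain `const` binding is enough; `StandInAbsyn` and `alias_into_main!` are hypothetical stand-ins, and the package's real mechanism may differ.

```julia
# Hypothetical stand-in for an already-loaded dependency module; the real
# names would be Absyn, ImmutableList, and MetaModelica.
module StandInAbsyn
    describe() = "stand-in Absyn"
end

# Bind `name` in Main to `mod` unless Main already defines that name.
function alias_into_main!(name::Symbol, mod::Module)
    if !isdefined(Main, name)
        Core.eval(Main, :(const $name = $mod))
    end
    return nothing
end

# Calling it twice is safe: the second call sees the existing binding.
alias_into_main!(:Absyn, StandInAbsyn)
alias_into_main!(:Absyn, StandInAbsyn)
```

The `isdefined` guard matters in the mounted case, where a host process may already have placed the same names in `Main`.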
Current route surface:
- `julia_file_summary`
- `julia_root_summary`
- `modelica_file_summary`
- `julia_ast_query`
- `modelica_ast_query`
Parser layout note:
- `src/parsers/julia/mod.jl` re-exports the Julia parser owner surface
- Julia parsing is split across focused files instead of a monolithic parser source
- `src/parsers/julia/state.jl` owns Julia collection-state containers
- `src/parsers/julia/summary.jl` owns Julia parser responses and summary-item shaping
- `src/parsers/julia/emit.jl` owns Julia summary and AST row emission helpers
- `src/parsers/julia/syntax.jl` owns `JuliaSyntax.SyntaxNode` inspection, naming, and line-span helpers
- `src/parsers/julia/types.jl` owns parser-native Julia type-header extraction such as type parameters, supertypes, and primitive bit widths
- `src/parsers/julia/dependencies.jl` owns Julia `import`, `using`, `include`, and shared dependency normalization
- `src/parsers/julia/functions.jl` owns parser-native Julia function-header extraction such as arity, varargs, `where` clauses, and return annotations
- `src/parsers/julia/collect.jl` owns SyntaxNode traversal and Julia summary or AST state collection
- `src/parsers/modelica/backend.jl` now only owns the `OMParser.jl` native bridge and shared-library/runtime bootstrap
- `src/parsers/modelica/nodes.jl` owns generic Modelica AST node materialization
- `src/parsers/modelica/dependencies.jl` owns Modelica `import`/`extends` emission plus shared dependency summary shaping
- `src/parsers/modelica/collect.jl` owns Modelica state collection, cache lifecycle, expression normalization, and non-dependency AST traversal
- `src/parsers/modelica/summary.jl` owns Modelica summary shaping over the collected state
- `src/search/` consumes those language-owned state collectors instead of reimplementing parser traversal logic
- `src/search/query/` now splits AST query parsing, node filtering, and match projection into focused files, so parser-side search semantics do not grow back into one flat query source
Contract note:
- schema version remains `v3` in the current workspace; additive native Arrow column expansion does not advance the published version until the next GitHub release cut
- summary responses emit typed
`summary_item_rows` plus stable scalar columns such as `module_name`, `module_kind`, `class_name`, and `restriction`, plus nullable detail columns such as `item_visibility`, `item_owner_name`, `item_owner_kind`, `item_owner_path`, `item_root_module_name`, `item_module_name`, `item_module_path`, `item_class_path`, `item_target_name`, `item_target_path`, `item_top_level`, `item_line_start`, `item_line_end`, `item_dependency_kind`, `item_dependency_form`, `item_dependency_target`, `item_dependency_is_relative`, `item_dependency_relative_level`, `item_dependency_local_name`, `item_dependency_parent`, `item_dependency_member`, `item_dependency_alias`, `item_binding_kind`, `item_type_kind`, `item_type_parameters`, `item_type_supertype`, `item_primitive_bits`, `item_is_partial`, `item_is_encapsulated`, `item_component_kind`, `item_array_dimensions`, `item_default_value`, `item_start_value`, `item_modifier_names`, `item_unit`, `item_function_positional_arity`, `item_function_keyword_arity`, `item_function_has_varargs`, `item_function_where_params`, and `item_function_return_type`
- AST requests carry typed query columns such as
`node_kind`, `name_equals`, `name_contains`, `text_contains`, `signature_contains`, `attribute_key`, `attribute_equals`, `attribute_contains`, and `limit`; there is no `query_json` fallback
- AST responses return one
`ast_match_rows` Arrow row per match, with fields such as `match_index`, `match_node_kind`, `match_name`, `match_text`, `match_signature`, `match_target_kind`, `match_target_name`, `match_module`, `match_path`, `match_module_kind`, `match_dependency_kind`, `match_dependency_form`, `match_dependency_target`, `match_dependency_is_relative`, `match_dependency_relative_level`, `match_dependency_local_name`, `match_dependency_parent`, `match_dependency_member`, `match_dependency_alias`, `match_owner_name`, `match_owner_kind`, `match_owner_path`, `match_root_module_name`, `match_module_name`, `match_module_path`, `match_class_path`, `match_target_path`, `match_top_level`, `match_binding_kind`, `match_type_kind`, `match_type_parameters`, `match_type_supertype`, `match_primitive_bits`, `match_function_positional_arity`, `match_function_keyword_arity`, `match_function_has_varargs`, `match_function_where_params`, `match_function_return_type`, `match_array_dimensions`, `match_start_value`, `match_modifier_names`, line spans, `match_attribute_key`, and `match_attribute_value`
- Julia summary and AST collection now run directly on
`JuliaSyntax.SyntaxNode`, including `@reexport using`, module docstrings, symbol docstrings, first-line function signatures, top-level `const` or `global` bindings, macro definitions, explicit `module_kind` normalization for `module` versus `baremodule`, explicit `type_kind` normalization for `struct`, `mutable_struct`, `abstract_type`, and `primitive_type`, and source-accurate line spans
- Julia type rows now also expose parser-owned type-header detail such as
`type_parameters`, `type_supertype`, and `primitive_bits`, so generic and primitive declaration structure stays queryable without re-parsing the signature string on the search side
- Julia function rows now also expose parser-owned function-header detail
from `JuliaSyntax.jl`, including positional arity, keyword arity, varargs, `where` parameters, return annotations, positional parameter names, keyword parameter names, defaulted parameter names, typed parameter names, and explicit positional or keyword vararg names
- same-scope Julia function methods are now preserved as distinct summary and
AST rows instead of being collapsed by short name; consumers can
disambiguate methods with parser-owned
`signature` plus source span, while `path` exposes the stable base symbol path
- Julia parameter rows now also expose one parser-owned summary item and AST
node per parameter, including
`parameter_kind`, `parameter_type_name`, `parameter_default_value`, `parameter_is_typed`, `parameter_is_defaulted`, `parameter_is_vararg`, plus method-level `target_path` so overloaded methods can be audited without flattening
- Julia docstring rows now separate the doc-literal span
(`line_start`/`line_end`) from the bound declaration span (`target_line_start`/`target_line_end`), so search consumers do not need to guess which semantic the parser encoded
- Julia dependency rows now preserve parser-owned dependency semantics
through shared
`dependency_kind`, `dependency_target`, `dependency_is_relative`, `dependency_relative_level`, `dependency_local_name`, `dependency_parent`, `dependency_member`, and `dependency_alias` fields, so `import`, `using`, and `include` can be queried through one normalized dependency contract without replacing the existing language-native groups
- Julia dependency rows now also preserve parser-owned local binding names
such as
`rd`, `DataFrame`, `BT`, `Utils`, and `foo`, so search can query the name that becomes visible in the current scope instead of re-deriving it from alias, member, or target strings
- Julia dependency rows now also preserve parser-owned syntax forms such as
`path`, `member`, `alias`, `aliased_member`, and `include`, so search can distinguish direct imports, selective imports, aliased imports, and include edges without re-parsing the dependency target
- Julia relative dependency rows now also preserve parser-owned leading-dot
semantics such as
`using ..Parent: foo`, `import .Utils`, and `import ..Core: bar as baz`, so search can query relative imports without inferring dot depth from raw dependency strings
- Modelica documentation comments are normalized before they become summary
items or AST nodes, so clients receive semantic content instead of raw
`//`, `/*`, or `*` lexer markers
- the Modelica backend now keeps a bounded same-source parse-state cache in
the service process, keyed by exact
`source_id` plus `source_text`, so repeated AST queries do not call `OMParser` again for unchanged input
- Modelica native summary rows now expose parser-side visibility, owner, type-name, qualifier, equation, component-kind, array-dimension, default-value, modifier-name, start-value, and unit detail without widening into any Rust-side adapter work
- Modelica `import` and `extends` rows now also expose the same shared `dependency_kind`, `dependency_target`, and named-import `dependency_alias` detail used by Julia dependency rows, so parser-side search can audit dependency semantics without re-deriving language-specific targets in `src/search/`
- Modelica import rows now also expose parser-owned
`dependency_local_name` such as `SI` and `Math`, so the local binding visible in Modelica scope is queryable over the native Arrow contract instead of being inferred from target-path leaf segments
- Modelica dependency rows now also expose parser-owned syntax forms such as
`named_import`, `qualified_import`, `unqualified_import`, and `extends`, so native search can distinguish imported binding shapes without reconstructing the Modelica source string on the host side
- Modelica qualified and unqualified imports now remain distinct parser
rows even when they target the same module path inside one class scope, so
search does not collapse
`import Modelica.Math;` and `import Modelica.Math.*;` into one dependency row
- Modelica named imports now also expose parser-owned
`dependency_alias` detail, while grouped imports now fail as deterministic parser-owned errors before the native bridge instead of aborting the service process
- AST search now resolves
`attribute_key` against parser-owned top-level node fields first and then against parser-owned `metadata`, so search can reuse native provider detail instead of inventing parallel search-only schema
- parser-native AST search now treats identifier-list fields such as
`function_keyword_params`, `function_defaulted_params`, `function_typed_params`, `function_positional_params`, and `modifier_names` as parser-owned list semantics during attribute matching, so `attribute_equals` can match one list member and `match_attribute_value` reports the exact matched member instead of the whole serialized field
- parser-native AST search now also treats parser-owned boolean and integer
fields as typed scalars during
`attribute_equals`, so values such as `function_has_varargs = true`, `function_positional_arity = 4`, `dependency_relative_level = 2`, `is_partial = true`, or `line_start = 2` are matched by native scalar equality instead of weak stringification; `attribute_contains` remains textual or identifier-list specific
- parser-native AST search can now query Julia attributes such as
`reexported`, `target_kind`, `target_name`, `target_line_start`, `target_line_end`, `root_module_name`, `top_level`, `module_name`, `module_path`, `owner_name`, `owner_kind`, `owner_path`, `path`, `dependency_kind`, `dependency_form`, `dependency_target`, `dependency_is_relative`, `dependency_relative_level`, `dependency_local_name`, `dependency_parent`, `dependency_member`, `dependency_alias`, `type_parameters`, `type_supertype`, `primitive_bits`, `function_positional_arity`, `function_keyword_arity`, `function_has_varargs`, `function_where_params`, `function_return_type`, `function_positional_params`, `function_keyword_params`, `function_defaulted_params`, `function_typed_params`, `function_positional_vararg_name`, `function_keyword_vararg_name`, `parameter_kind`, `parameter_type_name`, `parameter_default_value`, `parameter_is_typed`, `parameter_is_defaulted`, and `parameter_is_vararg`, and Modelica attributes such as `owner_name`, `owner_path`, `class_path`, `top_level`, `dependency_kind`, `dependency_form`, `dependency_target`, `dependency_local_name`, `dependency_alias`, `visibility`, `type_name`, `variability`, `direction`, `component_kind`, `array_dimensions`, `default_value`, `start_value`, `modifier_names`, `unit`, `restriction`, `is_partial`, `is_final`, and `is_encapsulated`
- scoped parser ownership now participates in dedup: repeated short names in different Julia modules or different Modelica class scopes are preserved as distinct AST nodes instead of being collapsed globally
- package tests are now split under `test/support/` and `test/cases/`, so `test/runtests.jl` stays as a small runner instead of a monolithic file
- parser-specific Flight round-trip coverage is now isolated in
`test/cases/flight_native_columns.jl`, and mounted shared-service parser regressions are isolated under `WendaoSearch.jl/test/integration/`, including `live_code_parser.jl`, `live_dependency_semantics.jl`, `live_relative_dependencies.jl`, `live_modelica_import_forms.jl`, and `live_julia_type_headers.jl`
- AST match rows now also promote parser-owned stable columns such as
`match_target_name`, `match_root_module_name`, `match_top_level`, `match_reexported`, `match_visibility`, `match_type_name`, `match_variability`, `match_direction`, `match_component_kind`, `match_default_value`, `match_unit`, `match_is_partial`, `match_is_final`, and `match_is_encapsulated`, so mounted consumers do not have to recover those semantics only through `match_attribute_key`/`match_attribute_value`
- Julia scope-owned rows now also propagate parser-owned
`top_level` semantics through summary and AST rows, so root-module declarations and nested-module declarations stay distinguishable without reconstructing scope only from `owner_path` or `module_path`
- Julia parameter rows now also expose parser-owned
`owner_signature` detail through summary and AST rows, so overloaded-method parameter searches can disambiguate method ownership without relying only on synthesized `target_path`
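The attribute-matching rules in the contract note (top-level fields before `metadata`, list-member equality for identifier lists, typed equality for booleans and integers) can be sketched roughly as follows. `SketchNode`, `resolve_attribute`, and `match_attribute_equals` are hypothetical names for illustration only; the real types and functions under `src/search/` may look different.

```julia
# Hypothetical node shape for illustration; not the package's real type.
struct SketchNode
    fields::Dict{String,Any}    # parser-owned top-level fields
    metadata::Dict{String,Any}  # parser-owned metadata bag
end

# Resolve an attribute key against top-level fields first, then metadata.
function resolve_attribute(node::SketchNode, key::AbstractString)
    haskey(node.fields, key) && return node.fields[key]
    return get(node.metadata, key, nothing)
end

# Return the matched value as a string, or `nothing` when nothing matches.
function match_attribute_equals(node::SketchNode, key::AbstractString,
                                wanted::AbstractString)
    value = resolve_attribute(node, key)
    if value === nothing
        return nothing
    elseif value isa Bool            # test Bool first because Bool <: Integer
        parsed = tryparse(Bool, wanted)
        return (parsed !== nothing && parsed == value) ? string(value) : nothing
    elseif value isa Integer         # typed scalar equality, not string equality
        parsed = tryparse(Int, wanted)
        return (parsed !== nothing && parsed == value) ? string(value) : nothing
    elseif value isa AbstractVector  # identifier list: match a single member
        idx = findfirst(==(wanted), value)
        return idx === nothing ? nothing : String(value[idx])
    else
        return string(value) == wanted ? string(value) : nothing
    end
end

node = SketchNode(
    Dict{String,Any}(
        "name" => "solve",
        "function_has_varargs" => true,
        "function_positional_arity" => 4,
        "function_keyword_params" => ["tol", "maxiter"],
    ),
    Dict{String,Any}("reexported" => false),
)

# Matches one list member and reports that member, not the whole list.
match_attribute_equals(node, "function_keyword_params", "maxiter")
```

In this sketch the returned string plays the role of `match_attribute_value`: for list fields it is the single matched member, and for typed scalars it is the canonical rendering of the stored value.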
GitHub Actions note:
- package-local CI now runs `Pkg.build()` plus `Pkg.test()` on `ubuntu-latest` and `macos-latest` for Julia `1.12` and `pre`
- a separate nightly workflow runs weekly on `ubuntu-latest`
- both workflows bootstrap `General` plus `OpenModelicaRegistry` before build and test, so remote runners do not depend on preinstalled registries