Releases: Pometry/Raphtory
v0.17.0
API Changes
Unified Filter API
The filter system has been completely overhauled. Multiple filter methods (filter_nodes, filter_edges, filter_exploded_edges) are replaced by a single unified filter() method. Filter expressions now require explicit context (filter.Node, filter.Edge, filter.ExplodedEdge). Views now consistently apply "to the right" in the call chain — no more one-hop semantics where filters reset after traversals. Pandas-style [] indexing is also supported for accessing nodes/edges without keeping the filter applied.
# Before (removed)
filtered = graph.filter_nodes(filter.Property("name") == "foo")
filtered = graph.filter_edges(filter.Property("weight") > 0.5)
# After
filtered = graph.filter(filter.Node.property("name") == "foo")
filtered = graph.filter(filter.Edge.property("weight") > 0.5)
filtered = graph[filter.Node.property("active") == True]  # Pandas-style
Important: see the Master filter PR (#2254) for a full migration guide covering Python and GraphQL changes. The new documentation also covers these changes.
Consolidated Load Functions
All format-specific load functions (load_edges_from_pandas, load_edges_from_parquet, etc.) have been removed in favour of unified load_edges(), load_nodes(), load_edge_metadata(), and load_node_metadata() methods that accept any Arrow-compatible data source, file path, or directory.
# Before (removed)
g.load_edges_from_pandas(df, ...)
g.load_edges_from_parquet("data.parquet", ...)
# After — single function, any source
g.load_edges(data=df, time="time", src="src", dst="dst")
g.load_edges(data="data.parquet", time="time", src="src", dst="dst")
g.load_edges(data="/dir/of/csvs/", time="time", src="src", dst="dst")
New capabilities include a schema parameter for explicit column type casting, a csv_options parameter for CSV reading configuration (delimiter, quoting, comments, etc.), and support for any Python object implementing the __arrow_c_stream__ interface (including Polars, FireDucks, DuckDB results, and PyArrow Tables), enabling zero-copy streaming into the Rust core. (#2423, #2391)
New NodeState
NodeState has been revamped and can now be used to join multiple Raphtory outputs together.
New capabilities include:
- merge(): combine multiple node states into multi-column results (e.g. merge PageRank with community labels)
- sort_by(cols) / top_k(cols): sort or rank by multiple columns
- groups(cols): group nodes by column values
- to_parquet() / from_parquet(): serialise/deserialise node states
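The idea behind merging, multi-column sorting and grouping can be sketched in plain Python. This is only an illustration using dictionaries: the helper functions below are hypothetical stand-ins and are not the NodeState API, and the node names and values are made up.

```python
# Two hypothetical per-node results, e.g. a PageRank score and a community label.
pagerank = {"alice": 0.42, "bob": 0.31, "carol": 0.27}
community = {"alice": 0, "bob": 1, "carol": 0}

def merge(left, right, left_col, right_col):
    """Join two node -> value mappings into node -> {column: value} rows."""
    return {
        node: {left_col: left[node], right_col: right[node]}
        for node in left
        if node in right
    }

def groups(rows, col):
    """Group node ids by the value of one column."""
    out = {}
    for node, row in rows.items():
        out.setdefault(row[col], []).append(node)
    return out

merged = merge(pagerank, community, "pagerank", "community")
# Sort by community first, then by descending PageRank within each community.
ranked = sorted(merged, key=lambda n: (merged[n]["community"], -merged[n]["pagerank"]))
print(ranked)                       # ['alice', 'carol', 'bob']
print(groups(merged, "community"))  # {0: ['alice', 'carol'], 1: ['bob']}
```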
History API
- All time-returning functions (earliest_time, latest_time, start, end, etc.) now return an EventTime object instead of a raw integer. The EventTime object provides .t (timestamp) and .dt (datetime) accessors, removing the need for separate earliest_datetime, latest_datetime, start_datetime, end_datetime etc. variants; these have all been removed.
- A new History object replaces the old plain list return type, providing rich, built-in functionality for working with temporal histories. This includes merging different histories together and exploring intervals. You can read more about this at https://docs.pometry.com/docs/querying/history.
(#2075)
Embedding & Vectorisation API
The embedding API has been significantly reworked. Key breaking changes:
- set_embeddings() removed: replaced by vectorise_graph() and vectorise_all_graphs(), which take the new OpenAIEmbeddings object directly
- with_vectorised_graphs() removed: use vectorise_graph() instead
- scores renamed to distances throughout all APIs (Python, Rust, GraphQL): similarity search results are now ranked in ascending order of distance rather than descending order of score
- get_documents_with_scores() → get_documents_with_distances()
- New classes: OpenAIEmbeddings, VectorCache, and embedding_server() decorator for custom embedding functions
# Before
server = GraphServer().set_embeddings(cache="/tmp/cache", embedding=my_fn)
selection.get_documents_with_scores()
# After
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
server = GraphServer()
server.vectorise_graph("my_graph", embeddings=embeddings)
selection.get_documents_with_distances()
(#2249)
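To illustrate the change in ranking direction, here is a minimal, library-free sketch. It assumes cosine distance (1 minus cosine similarity) purely for illustration; the release notes do not state which distance metric Raphtory uses, and the document names and vectors are made up.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (smaller means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def rank_by_distance(query, documents):
    """Return (name, distance) pairs, closest first (ascending distance)."""
    scored = [(name, cosine_distance(query, vec)) for name, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Hypothetical embeddings, for illustration only.
docs = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
ranking = rank_by_distance([1.0, 0.0], docs)
print([name for name, _ in ranking])  # ['doc_a', 'doc_c', 'doc_b']
```

Code that previously took the highest score as the best match should now take the lowest distance.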
Improvements
- SSSP algorithm is now directed by default (#2426)
- Stable GraphQL schema output for deterministic builds (#2427)
- Local clustering coefficient no longer requires a static graph (#2433)
- Forbid special characters in graph names and namespaces (#2496)
- Filter documentation added (#2472)
- Generate random graphs using the Erdős–Rényi model directly from Raphtory. (#2253)
Bug Fixes
- Fix client errors silently disappearing with async_graphql 7.1.0 (#2439)
- Fix GraphQL query cache not evicting automatically when graph data changes (#2454)
- Fix test issues revealed by pandas 3.0 (#2448)
- Fix schema casting issues (#2479)
- Fix auto-release GitHub Action (#2498)
New Contributors
- @DanielLacina made their first contribution in #2253
What's Changed
- Release v0.17.0 by @github-actions[bot] in #2419
- Adding support for loading data from any __arrow_c_stream__ source by @arienandalibi in #2391
- Distinct History Object for nodes and edges by @arienandalibi in #2075
- only trigger autofixing if all tests pass by @ljeub-pometry in #2422
- bring back rust_fmt_check by @ljeub-pometry in #2424
- Made sssp directed by @miratepuffin in #2426
- Make graphql schema output stable by @ljeub-pometry in #2427
- local clustering coefficient relax static graph requirement by @BaCk7 in #2433
- Update UI to v0.2.1 by @miratepuffin in #2431
- update version of python for the docker image by @ricopinazo in #2434
- Master filter by @shivamka1 in #2254
- Prevent client errors from going missing with async_graphql 7.1.0 by @ljeub-pometry in #2439
- Consolidate load functions, add schema for casting, and add CSV options for reading by @arienandalibi in #2423
- impl windowing filter by @shivamka1 in #2441
- Features/graph filter by @shivamka1 in #2446
- Fix test issues revealed by pandas 3.0 by @ljeub-pometry in #2448
- Update UI to 53ab3a3f0 (v0.2.1) by @miratepuffin in #2450
- implemented Erdos-Renyi model generation by @DanielLacina in #2253
- Make UI tests required again by @ljeub-pometry in #2455
- trigger graphql cache eviction automatically by @ricopinazo in #2454
- impl in out components filter by @shivamka1 in #2458
- impl bool filters, add tests by @shivamka1 in #2467
- fixes for docker ci by @ricopinazo in #2470
- merge clis and configs by @ricopinazo in #2457
- impl docs by @shivamka1 in #2472
- fix the schema casting by @ljeub-pometry in #2479
- embedding api improvements by @ricopinazo in #2249
- fix auto release action by @ricopinazo in #2498
- forbid weird characters on graph names or namespaces by @ricopinazo in #2496
- Update UI to c8c6dbdf4 (v0.2.1) by @miratepuffin in #2505
- GenericNodeState by @wyatt-joyner-pometry in #2342
- Features/nodestatefilter by @shivamka1 in #2483
- Update UI to a3d8781b1 (v0.3.0) by @miratepuffin in #2510
- Update UI to 9ddf8253e (v0.3.0) by @miratepuffin in #2514
Full Changelog: v0.16.4...v0.17.0
v0.16.5
What's Changed
- update version of python for the docker image by @ricopinazo in #2434
- Prevent client errors from going missing with async_graphql 7.1.0 by @ljeub-pometry in #2439
- Fix test issues revealed by pandas 3.0 by @ljeub-pometry in #2448
- trigger graphql cache eviction automatically by @ricopinazo in #2454
- fixes for docker ci by @ricopinazo in #2470
- merge clis and configs by @ricopinazo in #2457
- fix cache and python dockerfile for v16 by @ricopinazo in #2456
- Update v16 by @ljeub-pometry in #2491
- Fix concurrency issue with metadata file and reading of graph metadata by @ljeub-pometry in #2488
- Release v0.16.5 by @github-actions[bot] in #2494
v0.16.4
Release v0.16.4
Highlights
- Raphtory is now available for Python 3.14.
- We have dropped support for 3.10 to allow Raphtory to be brought up to date with the latest version of PyO3. The minimum Python version is now 3.11.
UI
- Filtered out all non-valid edges in direct connections.
- Added a node type loading indicator to search page to let you know when the graph has finished loading into the cache.
- Added a new layout customiser panel which allows you to change the parameters of the layout algorithm and run multiple layouts as part of a pipeline.
- Added node and edge styling which can be set in metadata for node types, layers or individual nodes and edges. Styles can be manually written into the graph or added from the UI. Currently, you can adjust the node colour or size and the edge colour in the timeline.
- In the future, style options will expand to everything that our underlying graph visualisation library (g6) offers.
Bug fixes
- The DataFrame and Arrow loaders now correctly handle different datetime formats (including Date32).
- DataFrame and Arrow loaders now correctly convert strings into timestamps (if possible) when provided as the time column. This brings them in line with the functionality of add_node and add_edge.
- The GraphQL health check now goes through the read and write rayon pools before returning, so deadlocks don't go undetected when using it.
- Added a check on otlp_agent_host to confirm the host accepts OpenTelemetry data (otherwise logs the failure to let the user know).
- Enabled logs from OTLP libraries so any connection errors are properly logged when debug level is enabled.
- Fixed an issue where the Python server, when given wrong arguments, would only fail after the timeout completed.
Known issues
What's Changed
This version also adds a docker compose setup under examples/grafana with:
- Raphtory set up to send traces to Tempo.
- Tempo set up to compute TraceQL metrics.
- Grafana set up with tempo as a datasource and a basic dashboard template.
Full tracing for complex queries generates quite large spans so we have added some different tracing levels. Available options are:
- COMPLETE: Provides full traces for each query.
- ESSENTIAL: Tracks key functions — addEdge, addEdges, deleteEdge, graph, updateGraph, addNode, node, nodes, edge, edges.
- MINIMAL: Provides only summary execution times.
v0.16.3
Highlights
Step aligned windows
Rolling and expanding functions have been updated so that the start of each window is aligned with the smallest unit of time passed by the user within the step.
For example, if the step is "1 month and 1 day", the first window will begin at the start of the most recent day. Explicitly, if the earliest time in the graph is 15/01/25 14:02:23 and you call the rolling function with a one-day step, you would get the following increments:
Increments in previous versions:
15/01/25 14:02:23 → 16/01/25 14:02:23 → 17/01/25 14:02:23 → 18/01/25 14:02:23 → …
Increments in v0.16.3:
15/01/25 00:00:00 → 16/01/25 00:00:00 → 17/01/25 00:00:00 → 18/01/25 00:00:00 → …
This change was made to make windows more intuitive. If someone wants a rolling window over "1 year", they typically want it to start at the beginning of the calendar year and end at the end of the year. You can also explicitly set the alignment_unit. For example, you can set g.rolling("1 month", alignment_unit="day") if you want to align to the most recent day.
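As a rough illustration of the alignment rule, the sketch below truncates a start time to a chosen unit using only the standard library. The align_start helper and its unit names are hypothetical; this is not Raphtory's implementation, only the arithmetic behind the example timestamps above.

```python
from datetime import datetime

def align_start(t: datetime, unit: str) -> datetime:
    """Truncate t to the start of the given unit (hypothetical helper)."""
    if unit == "year":
        return t.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return t.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return t.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")

earliest = datetime(2025, 1, 15, 14, 2, 23)
aligned = align_start(earliest, "day")
print(aligned)  # 2025-01-15 00:00:00
```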
In addition to this change, if rolling or expanding on the 29th, 30th or 31st in monthly increments, you will return to this day if it is present in the next month (or as close as possible). Previously, once the day had been moved back to fit a shorter month, later windows stayed on that earlier day:
Increments in previous versions:
31/01/25 → 28/02/25 → 28/03/25 → 28/04/25 → …
Increments in v0.16.3:
31/01/25 → 28/02/25 → 31/03/25 → 30/04/25 → …
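The difference between the two behaviours comes down to whether each step is computed from the original start date or from the previous (already clamped) date. A minimal standard-library sketch, where add_months is a hypothetical helper and not part of Raphtory:

```python
import calendar
from datetime import date

def add_months(anchor: date, months: int) -> date:
    """Shift anchor by whole months, clamping the day to the target month's length."""
    index = anchor.month - 1 + months
    year = anchor.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(anchor.day, last_day))

start = date(2025, 1, 31)

# New behaviour: always step from the original anchor, so the 31st is restored.
anchored = [add_months(start, k) for k in range(4)]

# Old behaviour: step from the previous result, so the clamped day sticks.
chained = [start]
for _ in range(3):
    chained.append(add_months(chained[-1], 1))

print(" → ".join(d.strftime("%d/%m/%y") for d in chained))
# 31/01/25 → 28/02/25 → 28/03/25 → 28/04/25
print(" → ".join(d.strftime("%d/%m/%y") for d in anchored))
# 31/01/25 → 28/02/25 → 31/03/25 → 30/04/25
```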
Bug fixes
- Previously, the timeline_start and timeline_end fallbacks for graphs that are not explicitly windowed looked at the filtered earliest and latest time. This made rolling/expanding inconsistent between different layers. Now when you call rolling or expanding functions on individual layers they will have the same window alignment.
- Improved the performance of computing the filtered time.
- Significant stress testing added for the server discovered several deadlocks at high concurrency. We rebuilt the locking mechanism in the GraphQL server to fix this.
- Fixed panics in case of simultaneous additions and reads (not all nodes were guaranteed to be initialised in iterators).
What's Changed
- temporal vs plain filtering by @jbaross-pometry in #2286
- bump rust version for release action by @ricopinazo in #2298
- add action for docker build cloud by @ricopinazo in #2309
- point to the correct docker path by @ricopinazo in #2310
- prevent graphql bench from complaning about addNodes by @ricopinazo in #2314
- add cache to docker build action by @ricopinazo in #2312
- fix action to build in docker build cloud by @ricopinazo in #2315
- optimise simple temporal intervals by @ljeub-pometry in #2320
- timeline start/end should use global earliest and latest time by @ljeub-pometry in #2319
- remove extra newline in macro docstrings by @jbaross-pometry in #2323
- Deadlock fixes and concurrency configuration from 0.16 by @miratepuffin in #2324
- not all nodes are guaranteed to be initialised in the iterators by @ljeub-pometry in #2325
- Separate thread pools for reading and writing in graphql by @ljeub-pometry in #2326
- Migrate polars-arrow to arrow-rs by @ljeub-pometry in #2316
- Rolling and expanding window alignment based on the user's time interval input by @arienandalibi in #2277
- Refactor test utils by @ljeub-pometry in #2329
- update pometry storage and fix the GID column issue by @fabianmurariu in #2332
- Add ui-tests submodule and newest UI by @louisch in #2305
- make all the main write locks loopy by @fabianmurariu in #2340
- Stress tests by @ricopinazo in #2317
- ingestion options by @jbaross-pometry in #2341
- Explicitly add filter to return types and misc filter stub fixes by @jbaross-pometry in #2330
- Release v0.16.3 by @github-actions[bot] in #2345
New Contributors
- @arienandalibi made their first contribution in #2277
Full Changelog: v0.16.2...v0.16.3
v0.16.2
What's Changed
- Fix explode layers for filtered persistent graph by @ljeub-pometry in #2241
- James/graphql docstrings fixes by @jbaross-pometry in #2239
- James/graphql-userguide-16-x by @jbaross-pometry in #2233
- fix nightly release action by @ricopinazo in #2244
- add docker retag action by @ricopinazo in #2245
- update Slack invite link by @edsherrington in #2252
- Increase sleep time on graphql bench by @ricopinazo in #2278
- Bump tracing-subscriber from 0.3.19 to 0.3.20 in the cargo group across 1 directory by @dependabot[bot] in #2251
- community detection by @jbaross-pometry in #2276
- Use raphtory from python dir by @jbaross-pometry in #2275
- Add EIDS to Node addition by @fabianmurariu in #2279
- Removed last graphql objects with gql in the name by @miratepuffin in #2283
- James/python docstrings by @jbaross-pometry in #2273
- Indexed node additions and moves tests into separate raphtory/tests by @fabianmurariu in #2289
New Contributors
- @edsherrington made their first contribution in #2252
Full Changelog: v0.16.1...v0.16.2
v0.16.1
What's Changed
- Graphql docs main by @jbaross-pometry in #2196
- update release to include raphtory-core by @miratepuffin in #2205
- Batch generate embeddings in add_nodes/add_edges by @fabubaker in #2201
- Fix python package CLI by @ricopinazo in #2208
- Test-ci-git-push by @jbaross-pometry in #2206
- add stubs and python linter by @jbaross-pometry in #2207
- Make plugin registry static by @miratepuffin in #2219
- fix deadlock on filtered_edges_iter by @ricopinazo in #2221
- add version function by @miratepuffin in #2220
- Fix/top k by @wyatt-joyner-pometry in #2228
- Fix/fastrp by @wyatt-joyner-pometry in #2229
- fix docker ci by @ricopinazo in #2227
- graphql bench on CI and vector bench by @ricopinazo in #2198
- James/graphql docstrings by @jbaross-pometry in #2210
- Release v0.16.1 by @github-actions[bot] in #2236
Full Changelog: v0.16.0...v0.16.1
v0.16.0
Replace constant properties with metadata
Constant properties have been completely separated from temporal properties and are now known as metadata. This means that expressions like x.properties.constant should be replaced with x.metadata as in the sample below.
This was done for two reasons:
- The fallback search where x.properties.get("...") would first check temporal properties and then constant properties was confusing and caused very unexpected behaviour in the filters.
- These are quite different concepts and upon reflection we felt that completely separating them in the API would make it clearer that there isn't any overlap.
You can now have metadata and properties of different types with the same key:
from raphtory import PersistentGraph

g = PersistentGraph()
node = g.add_node(timestamp=1, id=1, properties={"weight": 1})
node.add_metadata(metadata={"weight": "string weight"})
print(node.metadata.get("weight"))    # reads the metadata entry
print(node.properties.get("weight"))  # reads the temporal property
Time semantics overhaul
- Separated explicit node updates from connected edge updates, allowing for better filtering.
- Filtering layers or edges now filters nodes if all the edge updates that added them are filtered out, i.e. the node is not added explicitly via add_node.
    - As a result, subgraph filters out nodes that don't have edges in the subgraph and were not explicitly added via add_node.
- Changed latest_time semantics for the PersistentGraph to return the time of the last update for the node, edge, or graph in the current view or the start of the window if there are no updates (previously +Infinity).
- The earliest_time and latest_time within a filtered Event Graph will now reflect the updates within the graph view instead of just window bounds.
- Added a Graph.valid() filter that only keeps edges that are currently valid without removing their history.
- For a PersistentGraph is_valid and is_active are no longer the same.
    - Active means there is an update during the period (addition or deletion).
    - Valid means that the edge's most recent update is an addition (persistent semantics).
    - Deleted means that the edge's most recent update is a deletion.
- The event graph preserves deletions if created from a persistent graph. An edge can have the following statuses:
    - Included: is active in the window (has an addition or deletion event).
    - Valid: has an addition event in the current view.
    - Deleted: has a deletion event in the current view.
- The default layer only exists if it has updates on it.
- Filtering an edge update on a persistent graph turns it into a deletion to keep the semantics sensible.
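The distinction between active, valid and deleted under persistent semantics can be sketched with a simplified model in which an edge is just a list of (time, kind) updates. The function names and the half-open window [start, end) are assumptions made for this illustration; this is not how Raphtory stores or evaluates edges.

```python
from typing import List, Optional, Tuple

Update = Tuple[int, str]  # (time, "add" or "delete")

def is_active(updates: List[Update], start: int, end: int) -> bool:
    """Active: at least one update (addition or deletion) falls in [start, end)."""
    return any(start <= t < end for t, _ in updates)

def latest_kind(updates: List[Update], end: int) -> Optional[str]:
    """Kind of the most recent update strictly before `end`, or None if there is none."""
    seen = [u for u in updates if u[0] < end]
    return max(seen, key=lambda u: u[0])[1] if seen else None

def is_valid(updates: List[Update], end: int) -> bool:
    """Valid: the most recent update is an addition."""
    return latest_kind(updates, end) == "add"

def is_deleted(updates: List[Update], end: int) -> bool:
    """Deleted: the most recent update is a deletion."""
    return latest_kind(updates, end) == "delete"

history = [(1, "add"), (5, "delete")]
print(is_active(history, 2, 4), is_valid(history, 4))    # False True
print(is_active(history, 4, 8), is_deleted(history, 8))  # True True
```

In the window from 2 to 4 the edge has no updates, so it is not active, yet it is still valid because its last update was an addition.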
New APIs
- Edge filtering and exploded edge filtering are now available on the PersistentGraph.
- Enabled filter negation within the property filter APIs.
- filter_exploded_edges now takes FilterExpr as input in Python.
    - The old Prop("name") API has been removed, use filter.Property("name") instead.
- Added node filters to PathFromNode and PathFromGraph.
- Added edge_history_count() to the nodes API.
GraphQL server
- Drastically improved the performance of the server - over 100 times faster within internal benchmarks.
- Enabled compression by default.
- Changed the Python client to only have one internal client instead of creating one for each query, resulting in 100x faster querying from Python.
- Added rolling and expanding to Graph, Node, Nodes, PathFromNode, Edge and Edges.
- Renamed all GraphQL structs that started with GQL to make the user facing schema cleaner.
- Changed all page endpoints to have two separate arguments for item-based and page-based offsets. The existing offset argument has been changed to be item-based, and a separate page_index argument has been added for the old page-based behavior. Both can also be used simultaneously.
- Added a new API for fetching both namespaces and graphs at the same time.
- The new object is called a NamespacedItem.
- Added apply_views to PathFromNode.
- You can now generate the GraphQL schema in Raphtory via the new CLI.
    - You can run raphtory-graphql schema > schema.graphql, removing the need to run a server.
- You can now insert a custom UI into your custom Raphtory builds via an environment variable.
- Exposed the GraphQL schema in Python - can now be printed via raphtory.graphql.schema()
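For the page endpoints mentioned above, one plausible way the item-based offset and the page-based page_index could combine is sketched below. The formula start = page_index * limit + offset is an assumption made for illustration; the notes only state that both arguments can be used together, and the page function here is a hypothetical stand-in operating on a plain list.

```python
def page(items, limit, offset=0, page_index=0):
    """Return one page: skip `page_index` full pages plus `offset` items."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    start = page_index * limit + offset
    return items[start:start + limit]

nodes = list(range(20))
print(page(nodes, limit=5, offset=2))                # [2, 3, 4, 5, 6]
print(page(nodes, limit=5, page_index=1))            # [5, 6, 7, 8, 9]
print(page(nodes, limit=5, offset=1, page_index=2))  # [11, 12, 13, 14, 15]
```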
GraphQL Bug fixes
- Fixed GraphQL signed integer fields not accepting negative numbers.
- Fixed a problem with namespaces returning null paths and not returning root.
- Fixed an issue with recursive writing of indexes causing the server to crash.
- Fixed an issue in rolling where if the step was bigger than the window size the final window would be empty.
- Changed caching policy to never kick out graphs after some timeout by default.
- Changed WindowSet to not allow zero size step.
- Added validation to edge and node filters to ensure the property type matches the given value.
Raphtory CLI
- Added a Raphtory CLI, installed via Python, from which you can start the server or print the schema.
UI
Temporal View
- Scrolling has been drastically improved so that hovering over the bar behaves nicely.
- Added the ability to pin nodes in the Temporal view to keep them at the top.
- Nodes now are highlighted in the Temporal view when selected in the graph. The old behaviour of filtering only to edges between highlighted nodes is togglable from the bottom right of the Temporal view.
- The bucketing of edges is now fixed.
Graph view
- Fixed visual artifacts when swapping between highlighting.
- Highlighting relationship types now highlights the edges correctly.
- The activity log and direct connections in the Context menu are now sorted correctly.
Search page
- Added relationship searching.
- Added namespace searching.
- Clarified that timeline filtering is optional.
- Fixed the filters so that comparisons, like 'greater than' or 'less than', work.
- String searching now can do partial matching.
Saved graphs page
- Minor bug fixes and UX improvements.
GraphRag
- Swapped our default embedded vector store from a homebrewed solution to Arroy.
- Added an argument to the vectorise function so that the user can set a path for storing the vector cache.
- Added support for missing apis on the template:
- access to constant_properties
- temporal_properties.
Property Indexes Alpha
- Indexes in Raphtory are now updatable and produce the same answer as the filter APIs. They can be saved to disk alongside the proto file and loaded back into memory via Rust, Python or a GraphQL server.
- Indexes are turned off by default, but can be enabled for the whole graph or for individual properties via Graph.create_index().
Python
- Removed unneeded Python dependencies and made those that are not needed for core functions optional.
- Relaxed the Numpy version to 1.26.
General Bug fixes
- Fixed filter_edges for layers after adding a constant property.
- Fixed a bug in the interaction between windowing and exploded edge filtering.
- Fixed the parquet reader, where Utf8View columns were being converted to LargeUtf8, which was causing problems further down the pipeline.
- Fixed some issues with decoding updates from proto between different versions of Raphtory.
What's Changed
- Added NodeStateStringF64 by @david-mrn in #2034
- Temporal View Fixes in UI by @rachchan in #2033
- Fix/Python tests by @ljeub-pometry in #2092
- add utf8view support for proptype conversion from arrow datatype by @wyatt-joyner-pometry in #2094
- Replace existing filters by @shivamka1 in #1991
- fix the benchmark permissions so they can be submitted to github pages by @ljeub-pometry in #2097
- fix the lock file by @ljeub-pometry in #2096
- Tests/disk graph by @shivamka1 in #2099
- GraphQL refactor + rolling/expanding by @miratepuffin in #2090
- arroy for vectors by @ricopinazo in #2074
- impl index spec by @shivamka1 in #2103
- Time semantics overhaul by @ljeub-pometry in #1969
- impl gql path filter, add tests by @shivamka1 in #2117
- impl gql index_spec by @shivamka1 in #2116
- Bump requests from 2.32.3 to 2.32.4 in /docs in the pip group across 1 directory by @dependabot[bot] in #2122
- fix for metadata disk graphs by @rachchan in #2114
- fixes filter_edges for layers is broken after add_constant_properties by @shivamka1 in #2123
- Features/gql filters by @shivamka1 in #2126
- Get number of edge updates for a node by @ljeub-pometry in #2125
- Rayon executor for GraphQL by @ljeub-pometry in #2128
- Enable edge filtering on PersistentGraph by @ljeub-pometry in #2137
- Expose valid edge filter in Python and GraphQL by @ljeub-pometry in <https://github.com/Pometry/R...
v0.15.1
Graphql
- Added a new option to output the GraphQL schema without running the server via raphtory-graphql schema > schema.graphql
- GraphQL now accepts signed integers (bug with underlying library that we patched)
- Created gqldocuments + output nodes and edges as well as gqldocument in that object -- for vector search
- You can now provide a custom UI as part of a private raphtory server.
misc
- Removed dependency on numpy 2.0, will now install/run with <2
- Several library upgrades for CVE reasons.
- Improved python testing pipeline
What's Changed
- enable setting up custom ui through env variable by @ricopinazo in #2000
- Fix reading of Utf8View columns in parquet reader by @ljeub-pometry in #2003
- Output nodes and edges in similarity search by @rachchan in #1975
- Fix/utf8view by @ljeub-pometry in #2005
- Update python dependencies and testing by @ljeub-pometry in #2021
- add as_ref to NodeView by @ljeub-pometry in #2024
- add option to output graphql schema by @ricopinazo in #2023
- update-ui-db132d339 by @miratepuffin in #2029
- Fix security and deps by @miratepuffin in #2025
- Use fixed dynamic_graphql and up rust version to 1.86 by @louisch in #2020
- add patchelf for docs by @miratepuffin in #2032
- Release v0.15.1 by @github-actions in #2031
Full Changelog: v0.15.0...v0.15.1
v0.15.0
API and Model changes
Property changes for Graph to Parquet
As part of our work to unify the in-memory and on-disk storage models of Raphtory and allow us to save directly to formats such as arrow and parquet we have had to make several changes to the model. These include:
- Restricting Map properties such that for each instance of the map in a history, each key has the same property type.
- Restricting List properties such that the values must be the same type.
- Removing Graphs and PersistentGraph properties.
Through this you can now save to/load from parquet via to_parquet and from_parquet. Once we have improved this slightly and added the ability to stream updates in, we will be deprecating the proto format for saving and moving fully to parquet. This is because loading from proto is using a huge amount of memory and is quite slow.
If any of these changes affect your use case, please reach out and we can assist.
Algorithm Result replaced with NodeState
One of the major roadmap objectives for Raphtory is to standardise all outputs as either a NodeState or EdgeState. These dataframe-like structures make post-processing significantly easier and, as more functionality is added, will allow more complicated pipelines to be optimised automatically by Raphtory, instead of having to swap over to writing a function in Rust.
As part of this release we have replaced all instances of AlgorithmResult with NodeState, an example of which can be seen below with PageRank.

These NodeState objects are indexable and have all of the same functionality previously available in the AlgorithmResult.

The only notable change is Group_by has been renamed to groups as there is only one value to group on. This returns a NodeGroups which is also indexable:

Fixing Persistent Graph semantics
- Changed the semantics for edge deletions without a corresponding addition so that they are only considered as an instantaneous event (the edge does not exist before or after)
- Fixed bug where property values for exploded edges were incorrect for the PersistentGraph
- Cleaned up semantics for earliest and latest time on edges accordingly
- Multiple updates at the start of the window are now handled properly
- No more spurious exploded edges if there is an update at the start of the window
Smaller changes/fixes
- Fixed an issue where contains and keys were giving inconsistent results for edge properties, leading to a panic
from raphtory import Graph

g = Graph()
g.add_edge(0, 1, 2, layer="a")
g.add_edge(0, 1, 2)
g.edge(1, 2).add_constant_properties({"test": 1})
constant_exploded = g.layer("a").edges.explode().properties.constant.values()  # used to panic here!
- Unified the logic between update_constant_properties and add_constant_properties on edges to make sure that the edge actually exists in the layer that the constant properties are being added to.
- Alongside this unification, if an edge has no temporal updates for one of its layers within a given window, it will now be correctly filtered out of the view. This was previously not happening if that layer had constant properties.
- Fixed a bug where adding empty temporal updates to graph properties incorrectly affected the earliest/latest time
- Removed the get_by_id function on Properties - this was nonsense and is now only available on temporal and constant properties individually.
- rolling and expanding can now accept Interval directly instead of complaining about incompatible Error types in the conversion
- Fixed a bug where the const properties for edges did not align with the values.
- Materialising an empty graph view now preserves the layer information.
- Fixed a bug where loading from DataFrame would miss adding edges to the layer adjacency lists
Graphql
Apply views
It can be quite annoying to parse the response from a Raphtory server when you have a use case where nested views are changed arbitrarily, altering the depth of results. As such we have added a new function applyViews which allows you to batch in a singular call. This function is available on the Graph, GQLNodes, GQLEdges, Edge and Node.
An example of this can be seen below where we apply excludeNodes, before, layers and edgeFilter and then get the properties of exploded edges - in the first screenshot (how you would currently do this) the edges appear 6 objects deep, which would change if we removed one of these filters. In the second screenshot the edges are 3 objects deep and this won't change if we add or remove filters. The results will otherwise be the same.

Sorting in Graphql
Unlike in python or rust where it is easy to sort the edge/node iterators on anything you like, in graphql this was not possible. This meant a lot more client side processing and made it impossible to page results if you want them sorted by say earliest time.
As such we have added a sorting functionality to GqlNodes and GqlEdges which allow you to order by time, property value and id (or a prioritised combination of these) before paging/listing. An example of this can be seen below where we are sorting nodes first by a property and then by the latest time.
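The "prioritised combination" idea is the usual multi-key sort: ties on the first criterion are broken by the second, and so on. A minimal Python sketch of that ordering, where the records, field names and sort_by helper are all hypothetical and unrelated to the GraphQL schema:

```python
def sort_by(items, criteria):
    """Sort by a prioritised list of (key, reverse) pairs using stable passes."""
    result = list(items)
    # Apply the lowest-priority key first; Python's sort is stable, so
    # earlier passes survive as tie-breakers for later ones.
    for key, reverse in reversed(criteria):
        result.sort(key=lambda item: item[key], reverse=reverse)
    return result

# Made-up node records for illustration.
nodes = [
    {"id": "a", "score": 2, "latest_time": 30},
    {"id": "b", "score": 1, "latest_time": 10},
    {"id": "c", "score": 2, "latest_time": 20},
    {"id": "d", "score": 1, "latest_time": 40},
]

# First by a property (ascending), then by latest time (descending).
ordered = sort_by(nodes, [("score", False), ("latest_time", True)])
print([n["id"] for n in ordered])  # ['d', 'b', 'a', 'c']
```

Sorting before paging is what makes the pages stable: each page is a slice of one fully ordered sequence.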

Namespaces and Graph metadata
We have added a new namespace API in GraphQL which allows you to easily explore the graphs which are present within each path, and explore the children and parent of each namespace. This will replace the GQLgraphs API, which will be deprecated.

Calling the graph function within a namespace will return a new MetaGraph object which allows you to query information about that graph without loading it - notably the node/edge count, when it was created, and when it was last edited/accessed.
This information is being stored inside the .raph file which will be automatically updated for any graphs you have saved from <0.15.0.
Read write permissions via JWT
We have added a JWT bearer auth layer on top of Raphtory. It works by using an EdDSA public key, which reduces the server's responsibility to only two things:
- Correctly validating JWTs.
- Allowing access only to those resources stated in the JWT.
Preventing leakage of the secret is out of the equation, since Raphtory doesn't have access to the private key that is responsible for encoding JWTs.
Currently we are using this to specify if users can read (accessing all graphs) or write (able to modify all graphs). However, in future versions this will be used to limit users to specific namespaces and possibly information within each graph.
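To make the token structure concrete: a JWT is three base64url segments (header, payload, signature) joined by dots, and it is sent as a bearer token in the `Authorization` header. The sketch below uses only the Python standard library to build a placeholder token and read its claims. It deliberately does not verify the signature (the standard library has no EdDSA support, so a real server would use a cryptography library for that step), and the claim name `access` with values `ro`/`rw` is a hypothetical stand-in, not necessarily what Raphtory expects.

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

def read_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "EdDSA":
        raise ValueError("unexpected signing algorithm")
    return json.loads(b64url_decode(payload_b64))

def can_write(claims: dict) -> bool:
    """Hypothetical permission check on an assumed 'access' claim."""
    return claims.get("access") == "rw"

# Placeholder token: the signature segment is NOT a real EdDSA signature.
header = b64url_encode(json.dumps({"alg": "EdDSA", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"access": "ro"}).encode())
token = f"{header}.{payload}.{b64url_encode(b'not-a-real-signature')}"

claims = read_claims(token)
request_headers = {"Authorization": f"Bearer {token}"}
```

Because only the holder of the private key can produce a valid signature, the server needs nothing but the public key to decide whether the claims can be trusted.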
Other changes
- Changed anywhere that was returning a list of Nodes or list of Edges to GQLNodes and GQLEdges respectively. This is so all output can be correctly paged. If you notice anywhere that is not the case, please do raise an issue.
- The in- and out-components were not applying the one-hop filter resetting correctly - the GQLNodes which are returned will now return back to the graph filter and can be layered/windowed differently than the node which in/out-components was called on.
- Added an optional ids argument to the nodes query in GraphQL for getting a subset of the nodes without having to reduce the graph via subgraph.
- Added a new mutation create_subgraph which we use to allow saving of graph views in the open source UI.
- Removed the ability to create RemoteEdge and RemoteNode directly in Python. These should now only be obtained from a RemoteGraph.
- Fixed a bug causing NaN floats to panic when querying through GraphQL.
- Changed the schema queries so they no longer eagerly iterate over all nodes in the graph. If a property has more than 100 variants, an empty list is returned to reduce computation.
Algorithms
- The docstrings, method signatures, and return types of many of the algorithms have been standardised as part of the swap to NodeState from AlgorithmResult
- Fix the order in which nodes are considered in the in- and out-component algorithm so the calculated distances are correct.
- Added integer support to balance algorithm - Previously, edge properties had to be converted to floats. Now ints and floats both work as expected.
- 'clustering_coefficient' is renamed to 'global_clustering_coefficient'. All of the clustering coefficient variants have been moved to a submodule of 'metrics' called 'clustering_coefficient'. It was previously extremely inefficient to run LCC on a group of nodes.
- The new batch version should do a better job of parallelizing the process and reducing overhead.
- Remove inefficient early-culling code from SCC implementation
- The SCC implementation featured a block of code in the beginning which exhaustively checked which nodes belong to a strongly connected component by performing a BFS search and checking if the source node is reachable from itself. In the way this is implemented, this is entirely redundant to the process of just executing Tarjan's SCC algorithm, which it already subsequently executes.
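For reference, a compact Python sketch of Tarjan's algorithm (not Raphtory's Rust implementation) shows why a separate self-reachability pass adds nothing: a single depth-first traversal already assigns every node to its component. The adjacency-dictionary input is an assumption made for illustration, and the recursive form is only suitable for small graphs because of Python's recursion limit.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    `graph` maps each node to an iterable of its successors.
    """
    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest index reachable from the node's DFS subtree
    stack = []        # nodes of components that are not yet complete
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        # The node is the root of a component: pop that component off the stack.
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components

# Two cycles (1-2-3 and 4-5) joined by a one-way edge, plus an isolated node.
graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
components = tarjan_scc(graph)
```

Every node is visited once and every edge examined once, so a preliminary BFS from each node to test whether it can reach itself only repeats work the traversal performs anyway.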
Documentation
- We have added a huge amount of documentation to python and graphql alon...
0.15-beta
API and Model changes
Property changes for Graph to Parquet
As part of our work to unify the in-memory and on-disk storage models of Raphtory and allow us to save directly to formats such as arrow and parquet we have had to make several changes to the model. These include:
- Restricting Map properties such that for each instance of the map in a history, each key has the same property type.
- Restricting List properties such that all values must be the same type.
- Removing Graphs and PersistentGraph properties.
Through this you can now save to/load from parquet via to_parquet and from_parquet. Once we have improved this slightly and added the ability to stream updates in, we will be deprecating the proto format for saving and moving fully to parquet. This is because loading from proto is using a huge amount of memory and is quite slow.
If any of these changes affect your use case, please reach out and we can assist.
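The two restrictions on Map and List properties amount to type-consistency checks. A rough Python sketch of the rule follows, using plain Python values rather than Raphtory's property types; the helper names are hypothetical, and nested containers are only compared by their outer type.

```python
def list_is_homogeneous(values):
    """A List property is valid when every element has the same type."""
    return len({type(v) for v in values}) <= 1

def map_history_is_consistent(history):
    """A Map property is valid when each key keeps one type across all updates.

    `history` is a list of dicts, one per update of the same map property.
    """
    seen = {}
    for update in history:
        for key, value in update.items():
            # Remember the first type seen for this key and compare later ones.
            expected = seen.setdefault(key, type(value))
            if type(value) is not expected:
                return False
    return True

ok_list = list_is_homogeneous([1, 2, 3])
bad_list = list_is_homogeneous([1, "two", 3.0])
ok_map = map_history_is_consistent([{"a": 1}, {"a": 2, "b": "x"}])
bad_map = map_history_is_consistent([{"a": 1}, {"a": "1"}])
```

Fixed types per key and per list are what allow a property to be written as a single typed column in a format such as parquet.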
Algorithm Result replaced with NodeState
One of the major roadmap objectives for Raphtory is to standardise all outputs as either a NodeState or EdgeState. These dataframe-like structures make post-processing significantly easier and, as more functionality is added, will allow more complicated pipelines to be optimised automatically by Raphtory, instead of having to swap over to writing a function in Rust.
As part of this release we have replaced all instances of AlgorithmResult with NodeState, an example of which can be seen below with Pagerank.

These NodeState objects are indexable and have all of the same functionality previously available in the AlgorithmResult.

The only notable change is Group_by has been renamed to groups as there is only one value to group on. This returns a NodeGroups which is also indexable:
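Conceptually, grouping a node state collects the nodes that share a value, and the result can then be indexed by position. The plain-Python sketch below illustrates that idea only: the dictionary stands in for a NodeState, and `group_nodes` is a hypothetical helper, not the Raphtory method.

```python
from collections import defaultdict

def group_nodes(state):
    """Group node names by their value.

    Returns a list of (value, nodes) pairs sorted by value, so the result
    can be indexed by position.
    """
    buckets = defaultdict(list)
    for node, value in state.items():
        buckets[value].append(node)
    return sorted((value, sorted(nodes)) for value, nodes in buckets.items())

# Hypothetical per-node result, e.g. a component id for each node.
component_of = {"a": 0, "b": 0, "c": 1, "d": 0, "e": 1}
groups = group_nodes(component_of)
first_value, first_nodes = groups[0]
```

Since a node state holds exactly one value per node, there is no choice of column to group on, which is why the grouping call takes no key argument.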

Fixing Persistent Graph semantics
- Changed the semantics for edge deletions without a corresponding addition so that they are only considered as an instantaneous event (the edge does not exist before or after)
- Fixed bug where property values for exploded edges were incorrect for the PersistentGraph
- Cleaned up semantics for earliest and latest time on edges accordingly
- Multiple updates at the start of the window are now handled properly
- No more spurious exploded edges if there is an update at the start of the window
Smaller changes/fixes
- Fixed an issue where contains and keys were giving inconsistent results for edge properties, leading to a panic
from raphtory import Graph
g = Graph()
g.add_edge(0, 1, 2, layer="a")
g.add_edge(0, 1, 2)
g.edge(1, 2).add_constant_properties({"test": 1})
constant_exploded = g.layer("a").edges.explode().properties.constant.values() # used to panic here!
- Unified the logic between update_constant_properties and add_constant_properties on edges to make sure that the edge actually exists in the layer that the constant properties are being added to.
- Alongside this unification, if an edge has no temporal updates for one of its layers within a given window, it will now be correctly filtered out of the view - this was previously not happening if that layer had constant properties.
- Fixed a bug where adding empty temporal updates to graph properties incorrectly affected the earliest/latest time
- Removed the get_by_id function on Properties - this was nonsense and is now only available on temporal and constant properties individually.
- rolling and expanding can now accept Interval directly instead of complaining about incompatible Error types in the conversion
- Fixed a bug where the const properties for edges did not align with the values.
- Materialising an empty graph view now preserves the layer information.
- Fixed a bug where loading from DataFrame would miss adding edges to the layer adjacency lists
Graphql
Apply views
It can be quite annoying to parse the response from a Raphtory server when you have a use case where nested views are changed arbitrarily, altering the depth of results. As such we have added a new function applyViews which allows you to batch these views in a single call. This function is available on the Graph, GQLNodes, GQLEdges, Edge and Node.
An example of this can be seen below where we apply excludeNodes, before, layers and edgeFilter and then get the properties of exploded edges - in the first screenshot (how you would currently do this) the edges appear 6 objects deep, which would change if we removed one of these filters. In the second screenshot the edges are 3 objects deep and this won't change if we add or remove filters. The results will otherwise be the same.
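The effect on response depth can be modelled with a toy Python sketch. It does not reproduce the server's exact response shape (so the absolute depths differ from the screenshots); it only assumes that each chained view adds one level of nesting, while a batched applyViews call adds a single level regardless of how many views it carries. The helper names are hypothetical.

```python
def chained_response(views, payload):
    """Model a chained query: each view adds one level of nesting."""
    result = payload
    for name in reversed(views):
        result = {name: result}
    return {"graph": result}

def batched_response(views, payload):
    """Model a batched query: all views sit under a single applyViews level."""
    return {"graph": {"applyViews": payload}}

def depth(obj):
    """Number of nested objects (dicts) along the deepest path."""
    if not isinstance(obj, dict):
        return 0
    return 1 + max((depth(v) for v in obj.values()), default=0)

views = ["excludeNodes", "before", "layers", "edgeFilter"]
payload = {"edges": {"list": []}}

# Depth of the response as views are added one at a time.
chained_depths = [depth(chained_response(views[:n], payload)) for n in range(1, 5)]
batched_depths = [depth(batched_response(views[:n], payload)) for n in range(1, 5)]
```

In this model the chained depth grows by one with every extra view, while the batched depth stays constant, so client code that walks the response does not need to change when filters are added or removed.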

Sorting in Graphql
Unlike in Python or Rust, where it is easy to sort the edge/node iterators on anything you like, this was not possible in GraphQL. This meant a lot more client-side processing and made it impossible to page results if you wanted them sorted by, say, earliest time.
As such we have added sorting to GqlNodes and GqlEdges, which allows you to order by time, property value and id (or a prioritised combination of these) before paging/listing. An example of this can be seen below where we are sorting nodes first by a property and then by the latest time.

Other changes
- Changed anywhere that was returning a list of Nodes or list of Edges to GQLNodes and GQLEdges respectively. This is so all output can be correctly paged. If you notice anywhere that is not the case, please do raise an issue.
- The in- and out-components were not applying the one-hop filter resetting correctly - the GQLNodes which are returned will now return back to the graph filter and can be layered/windowed differently than the node which in/out-components was called on.
- Added an optional ids argument to the nodes query in GraphQL for getting a subset of the nodes without having to reduce the graph via subgraph.
- Added a new mutation create_subgraph which we use to allow saving of graph views in the open source UI.
- Removed the ability to create RemoteEdge and RemoteNode directly in Python. These should now only be obtained from a RemoteGraph.
- Fixed a bug causing NaN floats to panic when querying through GraphQL.
- Changed the schema queries so they no longer eagerly iterate over all nodes in the graph. If a property has more than 100 variants, an empty list is returned to reduce computation.
Algorithms
- The docstrings, method signatures, and return types of many of the algorithms have been standardised as part of the swap to NodeState from AlgorithmResult
- Fix the order in which nodes are considered in the in- and out-component algorithm so the calculated distances are correct.
- Added integer support to balance algorithm - Previously, edge properties had to be converted to floats. Now ints and floats both work as expected.
- 'clustering_coefficient' is renamed to 'global_clustering_coefficient'. All of the clustering coefficient variants have been moved to a submodule of 'metrics' called 'clustering_coefficient'. It was previously extremely inefficient to run LCC on a group of nodes.
- The new batch version should do a better job of parallelizing the process and reducing overhead.
- Remove inefficient early-culling code from SCC implementation
- The SCC implementation featured a block of code in the beginning which exhaustively checked which nodes belong to a strongly connected component by performing a BFS search and checking if the source node is reachable from itself. In the way this is implemented, this is entirely redundant to the process of just executing Tarjan's SCC algorithm, which it already subsequently executes.
Documentation
- We have added a huge amount of documentation to Python and GraphQL, alongside improvements to the stub generator to let us know what is missing. There are currently warnings everywhere as there is still a lot to add, but this should make it much easier to manage moving forward.
- We have turned the stub generator into a python package that can be installed for use with other projects - This will probably be released to pypi soon.
Vector APIs
- Added default document templates as having default templates is a first step towards a smart search view on the open source UI.
- Updated the vector API (on the server as well) to allow choosing between using the default template, a custom one, or nothing at all, for each of the three types of entities
- Fixed a bug causing subgraphs to allow containing the same node more than once
- Reviewed public API to stick to temporal_props / constant_props naming convention
Optimisations and misc
- Started work on several known issues when iterating over edges - still much to do, but should be noticeably faster now.
- Calling edges on a subgraph should no longer iterate over all edges in the entire graph to apply the subgraph filter.
- Now using DoubleEndedIterator for the last value in node temporal properties.
- Fix the optimisation that checks if the window is actually a constraint to look at the underlying storage, not the wrapped view (which is both potentially slow and incorrect). This increases performance notably for nested windows.
- Fixed GIL deadlock when ...

