wendao-datascience

wendao-datascience is an LLM-oriented data-science facade over the outputs of Wendao query and get calls.

It exists to keep one workflow simple:

- normalize Wendao payloads, rows, or Arrow tables into one stable dataset object
- expose Arrow and Polars views over that dataset
- summarize the dataset in a form that an LLM can consume quickly
- help an LLM write one or more Python scripts that together form a strong analyzer
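
To make the first and third steps concrete, here is a minimal standard-library sketch of normalizing row-shaped input and summarizing it as a short Markdown profile. It is not the package's implementation: `normalize_rows` and `profile_rows` are hypothetical helper names, and the assumption that payloads keep their records under a `"rows"` key is taken only from the Quick Start example.

```python
from collections.abc import Mapping
from typing import Any


def normalize_rows(source: Any) -> list[dict[str, Any]]:
    """Coerce a payload mapping or an iterable of row mappings into a list of dicts."""
    if isinstance(source, Mapping):
        # Assumption: a payload stores its records under the "rows" key.
        source = source.get("rows", [])
    return [dict(row) for row in source]


def profile_rows(rows: list[dict[str, Any]]) -> str:
    """Render a compact Markdown profile: one table line per column."""
    # Collect column names in first-seen order so the output is stable.
    columns: list[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    lines = [
        f"rows: {len(rows)}",
        "",
        "| column | types | nulls | distinct |",
        "|---|---|---|---|",
    ]
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [value for value in values if value is not None]
        types = ", ".join(sorted({type(value).__name__ for value in present})) or "unknown"
        # repr() keeps the distinct count working for unhashable values.
        distinct = len({repr(value) for value in present})
        lines.append(f"| {name} | {types} | {len(values) - len(present)} | {distinct} |")
    return "\n".join(lines)


if __name__ == "__main__":
    payload = {
        "rows": [
            {"doc_id": "doc-1", "language": "python", "score": 0.91},
            {"doc_id": "doc-2", "language": "rust", "score": 0.77},
        ]
    }
    print(profile_rows(normalize_rows(payload)))
```

A profile this small fits comfortably in a prompt, which is the point of summarizing before asking an LLM to write analysis code.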

This package does not own Wendao transport. wendao-core-lib and wendao-arrow-interface remain the transport and session-facing layers. This package starts after the Wendao data is already materialized.

Upstream Pins

The package pins wendao-core-lib and wendao-arrow-interface through [tool.uv.sources] so both packages resolve from the same upstream Git revision over https://github.com/tao3k/xiuxian-artisan-workshop.

Quick Start

uv sync
uv run pytest
uv run python examples/scripted_repo_search_first_implementation.py

from wendao_datascience import WendaoDataset

payload = {
    "rows": [
        {"doc_id": "doc-1", "language": "python", "score": 0.91},
        {"doc_id": "doc-2", "language": "rust", "score": 0.77},
    ]
}

dataset = WendaoDataset.from_query_payload(payload, route="/search/repos/main")
frame = dataset.to_polars()
profile = dataset.profile()
request = dataset.build_script_request("Summarize score distribution by language")

print(frame)
print(profile.to_markdown())
print(request.prompt)
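
For orientation, the script an LLM returns for a request like the one above might resemble the following. This is an illustrative sketch only: it uses the standard library instead of Polars so it runs without extra dependencies, it repeats the sample rows inline, and `score_summary_by_language` is a hypothetical name that does not come from the package.

```python
from collections import defaultdict
from statistics import mean

# Same sample rows as the Quick Start payload.
rows = [
    {"doc_id": "doc-1", "language": "python", "score": 0.91},
    {"doc_id": "doc-2", "language": "rust", "score": 0.77},
]


def score_summary_by_language(rows):
    """Group scores by language and report count, min, mean, and max per group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["language"]].append(row["score"])
    return {
        language: {
            "count": len(scores),
            "min": min(scores),
            "mean": mean(scores),
            "max": max(scores),
        }
        for language, scores in sorted(grouped.items())
    }


summary = score_summary_by_language(rows)
for language, stats in summary.items():
    print(language, stats)
```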

Documentation

The package's positioning and its LLM-facing goal are documented in docs/llm_analyzer_mission.md.

The first concrete implementation is examples/scripted_repo_search_first_implementation.py, which turns one WendaoArrowSession repo-search result into:

- one WendaoDataset
- one repo-search overview
- one LLM-ready script request
