Add missing array/list functions and aliases #1452

@timsaucer

Summary

Several array/list functions from upstream DataFusion are not yet exposed in datafusion-python. This includes new functions and missing list_* aliases for existing array_* functions.

Missing Functions (new)

  • array_any_value / list_any_value: returns the first non-null element in the array
  • array_contains / list_contains: aliases for array_has
  • array_distance / list_distance: computes the Euclidean distance between two arrays of equal length
  • array_max / list_max: returns the maximum element
  • array_min / list_min: returns the minimum element
  • array_reverse / list_reverse: reverses the order of elements in the array
  • arrays_overlap: checks whether two arrays share any elements (upstream, an alias of array_has_any)
  • arrays_zip / list_zip: zips multiple arrays into an array of structs
  • generate_series: generates a series of values; like range, but the upper bound is included
  • string_to_array / string_to_list: splits a string into an array by delimiter
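
To make the intended behaviour of a few of these functions concrete, here is a plain-Python model over ordinary lists. It is only a sketch: the real functions operate on DataFusion expressions backed by Arrow arrays, the helper names below merely mirror the SQL names, and the null handling (skipping `None` in `array_max`) is an assumption rather than something this issue confirms.

```python
import math


def array_any_value(arr):
    """Return the first non-null element, or None if every element is null."""
    return next((x for x in arr if x is not None), None)


def array_distance(a, b):
    """Euclidean distance between two equal-length numeric lists."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return math.dist(a, b)


def array_max(arr):
    """Largest element; nulls are skipped (assumed behaviour)."""
    values = [x for x in arr if x is not None]
    return max(values) if values else None


def array_reverse(arr):
    """Return a new list with the elements in reverse order."""
    return arr[::-1]


def arrays_overlap(a, b):
    """True if the two lists share at least one element."""
    return any(x in b for x in a)


def generate_series(start, stop, step=1):
    """Like range(), but the stop value is included when it is reachable."""
    if step == 0:
        raise ValueError("step must not be zero")
    end = stop + 1 if step > 0 else stop - 1
    return list(range(start, end, step))


def string_to_array(text, delimiter):
    """Split a string into a list on the given delimiter."""
    return text.split(delimiter)
```

For example, `generate_series(1, 3)` yields `[1, 2, 3]` in this model, whereas `range(1, 3)` would stop at 2.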

Missing list_* Aliases

The following list_* aliases exist upstream but are not in __all__:

  • list_empty
  • list_pop_back
  • list_pop_front
  • list_has
  • list_has_all
  • list_has_any

Upstream Reference

Implementation

  • Rust bindings: crates/core/src/functions.rs
  • Python wrappers: python/datafusion/functions.py

Note: This gap analysis was performed using an AI agent comparing upstream DataFusion v53 documentation against the current datafusion-python codebase.
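
A gap check of this kind can also be approximated mechanically with a set difference between the upstream function names and the names the Python module exports. The snippet below is a minimal sketch with hypothetical, hard-coded name lists. In practice the second list would presumably come from the `__all__` mentioned above, and the first from the upstream documentation.

```python
def missing_names(upstream, exported):
    """Return upstream names that are absent from the exported names, sorted."""
    return sorted(set(upstream) - set(exported))


# Hypothetical sample data, for illustration only.
upstream_names = ["array_has", "list_has", "array_empty", "list_empty", "array_reverse"]
exported_names = ["array_has", "array_empty"]

print(missing_names(upstream_names, exported_names))
# prints ['array_reverse', 'list_empty', 'list_has']
```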
