Introduction

Cherry is a Python library for building blockchain data pipelines.

It is designed to make building production-ready blockchain data pipelines easy.

Getting Started

See the getting started section of the docs.

Features

  • Pure Python library. No YAML, SQL, TOML, etc. required.
  • High-level datasets API and flexible pipeline API.
  • High-performance, low-cost and uniform data access. Ability to use advanced providers without platform lock-in.
  • Built-in functionality to decode, validate, and transform blockchain data, all implemented in Rust for performance.
  • Write transformations using Polars, PyArrow, DataFusion, pandas, DuckDB, or any other PyArrow-compatible library.
  • Schema inference automatically creates output tables.
  • Keep datasets fresh with continuous ingestion.
  • Parallelized: the next batch of data is fetched while your pre-processing function runs and while database writes execute. No hand optimization needed.
  • Includes a library of transformations.
  • Includes functionality for implementing crash resistance.
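
The overlap of fetching, pre-processing, and writing described in the list above can be illustrated with a small standard-library sketch. This is not Cherry's implementation or API: the function names (fetch_batches, transform_batches, write_batches, run_pipeline), the batch shape, and the in-memory sink are all hypothetical, and asyncio queues stand in for whatever mechanism the library actually uses. It only shows the general idea of running three stages concurrently so that one batch can be fetched while another is transformed and a third is written.

```python
import asyncio


async def fetch_batches(out_q: asyncio.Queue, num_batches: int) -> None:
    """Hypothetical fetch stage: produces batches and hands them downstream."""
    for i in range(num_batches):
        await asyncio.sleep(0)  # stand-in for network I/O against a data provider
        await out_q.put({"batch": i, "values": [i, i + 1, i + 2]})
    await out_q.put(None)  # sentinel: no more batches


async def transform_batches(in_q: asyncio.Queue, out_q: asyncio.Queue) -> None:
    """Hypothetical pre-processing stage: adds a derived column to each batch."""
    while (batch := await in_q.get()) is not None:
        await out_q.put({**batch, "total": sum(batch["values"])})
    await out_q.put(None)  # forward the sentinel to the writer


async def write_batches(in_q: asyncio.Queue, sink: list) -> None:
    """Hypothetical write stage: appends to a list instead of a real database."""
    while (batch := await in_q.get()) is not None:
        await asyncio.sleep(0)  # stand-in for a database write
        sink.append(batch)


async def run_pipeline(num_batches: int) -> list:
    # Bounded queues let the fetcher run ahead by a few batches without
    # buffering the whole dataset in memory.
    fetched = asyncio.Queue(maxsize=2)
    transformed = asyncio.Queue(maxsize=2)
    sink = []
    # All three stages run concurrently on the same event loop.
    await asyncio.gather(
        fetch_batches(fetched, num_batches),
        transform_batches(fetched, transformed),
        write_batches(transformed, sink),
    )
    return sink


written = asyncio.run(run_pipeline(3))
print([batch["total"] for batch in written])
```

The bounded queues are the main design choice in this sketch: they provide backpressure, so a slow writer eventually pauses the fetcher rather than letting unwritten batches pile up.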

Data providers

Provider            Ethereum (EVM)    Solana (SVM)
HyperSync
SQD
Yellowstone-GRPC

Supported output formats

  • ClickHouse
  • Iceberg
  • Delta Lake
  • DuckDB
  • Arrow Datasets
  • Parquet

Examples

See the examples on GitHub and the getting started section of the docs.

License

Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
  • MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Sponsors