Introduction
Cherry is a Python library for building blockchain data pipelines.
It is designed to make production-ready pipelines easy to build.
Getting Started
See the getting started section of the docs.
Features
- Pure Python library. No need for yaml, SQL, toml etc.
- High-level datasets API and flexible pipeline API.
- High-performance, low-cost and uniform data access. Ability to use advanced providers without platform lock-in.
- Included functionality to decode, validate, and transform blockchain data. All implemented in Rust for performance.
- Write transformations using polars, pyarrow, datafusion, pandas, duckdb or any other pyarrow-compatible library.
- Schema inference automatically creates output tables.
- Keep datasets fresh with continuous ingestion.
- Parallelized: the next batch of data is fetched while your pre-processing function runs and database writes execute in parallel. No need to hand-optimize anything.
- Included library of transformations.
- Included functionality to implement crash-resistance.
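The parallelized ingestion described above can be pictured as three stages connected by bounded queues. The sketch below is a generic illustration using only the Python standard library; it is not Cherry's API, and every name in it (`fetch_batches`, `transform_batches`, `write_batches`, `run_pipeline`) is hypothetical. The sleeps stand in for network and database latency.

```python
import asyncio


async def fetch_batches(out_q, num_batches):
    """Hypothetical fetch stage: produces small batches of block numbers."""
    for i in range(num_batches):
        await asyncio.sleep(0.01)  # stand-in for a provider request
        await out_q.put({"block_number": [i * 2, i * 2 + 1]})
    await out_q.put(None)  # sentinel: no more batches


async def transform_batches(in_q, out_q):
    """Hypothetical pre-processing stage: adds a derived column."""
    while (batch := await in_q.get()) is not None:
        batch["is_even"] = [n % 2 == 0 for n in batch["block_number"]]
        await out_q.put(batch)
    await out_q.put(None)  # forward the sentinel to the writer


async def write_batches(in_q, sink):
    """Hypothetical write stage: appends to a list instead of a database."""
    while (batch := await in_q.get()) is not None:
        await asyncio.sleep(0.01)  # stand-in for a database write
        sink.append(batch)


async def run_pipeline(num_batches):
    # Bounded queues let each stage run one batch ahead of the next stage.
    fetched = asyncio.Queue(maxsize=1)
    transformed = asyncio.Queue(maxsize=1)
    sink = []
    await asyncio.gather(
        fetch_batches(fetched, num_batches),
        transform_batches(fetched, transformed),
        write_batches(transformed, sink),
    )
    return sink


if __name__ == "__main__":
    written = asyncio.run(run_pipeline(3))
    print(f"wrote {len(written)} batches")
```

Because the fetch and write stages spend their time waiting, the next request can be in flight while the previous batch is still being written. The queue size bounds how far the fetcher can run ahead.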
Data providers
Provider | Ethereum (EVM) | Solana (SVM) |
---|---|---|
HyperSync | ✅ | ❌ |
SQD | ✅ | ✅ |
Yellowstone-GRPC | ❌ | ✅ |
Supported output formats
- ClickHouse
- Iceberg
- Delta Lake
- DuckDB
- Arrow Datasets
- Parquet
Examples
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.