Blocks Dataset

The blocks dataset allows you to extract block headers from any EVM chain. This is one of the most fundamental datasets for blockchain analysis.

Usage

from cherry_etl import datasets
from cherry_etl.pipeline import run_pipeline

# Create a pipeline for blocks
pipeline = datasets.evm.blocks(
    provider=provider,
    writer=writer,
    from_block=18123123,  # Starting block number
    to_block=18123200     # Ending block number
)

# Run the pipeline
await run_pipeline(pipeline_name="blocks", pipeline=pipeline)

Output Schema

FieldTypeDescription
numberuint64Block number
hashstringBlock hash
parent_hashstringHash of the parent block
noncestringBlock nonce
logs_bloomstringBloom filter for logs
transactions_rootstringMerkle root of transactions
state_rootstringMerkle root of state
receipts_rootstringMerkle root of receipts
minerstringAddress of the block miner
difficultyDecimal128(38, 0)Block difficulty
total_difficultyDecimal128(38, 0)Cumulative difficulty
extra_datastringExtra data field
sizeDecimal128(38, 0)Block size in bytes
gas_limitDecimal128(38, 0)Gas limit for the block
gas_usedDecimal128(38, 0)Gas used in the block
timestampDecimal128(38, 0)Block timestamp
uncleslist(binary)List of uncle block hashes
base_fee_per_gasDecimal128(38, 0)Base fee per gas (EIP-1559)
withdrawals_rootstringRoot hash of withdrawals (Shanghai upgrade)

Example Queries

Get Latest Block

SELECT * FROM blocks ORDER BY number DESC LIMIT 1;

Get Blocks in a Time Range

SELECT * FROM blocks 
WHERE timestamp BETWEEN 1704067200 AND 1704153600;  -- Last 24 hours

Get Block Statistics

SELECT 
    MIN(number) as first_block,
    MAX(number) as last_block,
    AVG(gas_used) as avg_gas_used,
    AVG(size) as avg_block_size
FROM blocks;

Notes

  • The dataset includes all fields from EVM block header
  • Timestamps are in Unix timestamp format
  • The base_fee_per_gas field is only available for blocks after the London hard fork (EIP-1559)
  • The withdrawals_root field is only available for blocks after the Shanghai upgrade