Compare commits

50 Commits

Author SHA1 Message Date
Evgeny Medvedev
fbd57fc079 Merge pull request #523 from GarmashAlex/fix/broken-link
docs: fix broken link
2025-08-27 09:07:52 +07:00
GarmashAlex
8204c0827d docs: fix broken link 2025-08-26 19:09:29 +03:00
Evgeny Medvedev
46b91a9ff2 Merge pull request #522 from mdqst/patch-1
docs: fix broken video link
2025-08-22 20:49:33 +07:00
Dmitry
b5fd64bdca docs: fix broken video link 2025-08-22 14:50:56 +03:00
Evgeny Medvedev
d8547e9c7c Merge pull request #521 from Galoretka/fix/broken-link
fix(docs): update Ethereum JSON-RPC links in ETL jobs
2025-08-21 14:54:21 +07:00
Galoretka
7ef53859c1 fix:broken link 2025-08-21 10:51:51 +03:00
Galoretka
e38d1c1f2f fix: broken link 2025-08-21 10:51:22 +03:00
Evgeny Medvedev
43fe6b49b3 Merge pull request #519 from blockchain-etl/medvedev1088-patch-1
Remove gitter link in README.md
2025-04-30 15:26:44 +07:00
Evgeny Medvedev
db274c8a85 Update README.md 2025-04-30 15:24:40 +07:00
Evgeny Medvedev
69247042a4 Merge pull request #518 from oksanaphmn/patch-1
docs: add license badge
2025-04-30 15:23:15 +07:00
oksanaphmn
218e1e4356 Update README.md 2025-04-27 13:16:05 +03:00
Evgeny Medvedev
5e0fc8cc75 Merge pull request #516 from gap-editor/develop
deleted link to discord from 'contact.md'
2025-04-05 09:13:55 +07:00
Maximilian Hubert
77efda5106 Update contact.md 2025-04-04 20:35:34 +02:00
Evgeny Medvedev
ece0b7f422 Merge pull request #515 from VolodymyrBg/bg
docs: extension of documentation in index.md with the addition of adv…
2025-04-04 21:34:37 +07:00
VolodymyrBg
b31b76a73a Update index.md 2025-04-04 17:33:03 +03:00
VolodymyrBg
0cb7eb60b5 docs: extension of documentation in index.md with the addition of advanced features and new projects 2025-04-02 20:02:14 +03:00
Evgeny Medvedev
02943f7caf Merge pull request #514 from blockchain-etl/medvedev1088-patch-1
Update exporting-the-blockchain.md
2025-04-01 09:23:59 +07:00
Evgeny Medvedev
b844b95868 Update exporting-the-blockchain.md 2025-04-01 09:22:53 +07:00
Evgeny Medvedev
4d305a284f Merge pull request #513 from Hopium21/patch-1
remove broken link to D5.ai
2025-04-01 09:22:15 +07:00
Hopium
e161e6ef13 Update exporting-the-blockchain.md 2025-03-31 20:28:23 +02:00
Evgeny Medvedev
9b917b8ddd Update README.md 2025-03-04 19:15:39 +07:00
Evgeny Medvedev
383caf8331 Merge pull request #511 from Radovenchyk/patch-2
docs: removed the discord link
2025-03-04 19:13:25 +07:00
Radovenchyk
c61e91235f Update README.md 2025-03-04 11:36:05 +02:00
Evgeny Medvedev
0e4b4a894b Merge pull request #510 from Radovenchyk/patch-1
docs: added shield and twitter link
2025-03-03 21:50:15 +07:00
Radovenchyk
d58c1ebda7 Update README.md 2025-03-03 16:37:36 +02:00
Evgeny Medvedev
f0bf07e60c Merge pull request #509 from maximevtush/patch-1
Update LICENSE
2025-01-30 18:16:00 +08:00
Maxim Evtush
efe7acdc13 Update LICENSE 2025-01-30 11:07:04 +01:00
Evgeny Medvedev
20404eca9e Merge pull request #506 from romashka-btc/code/fix
typos/fix
2024-12-19 11:33:47 +08:00
Romashka
435cbe0a74 typo-Update exporters.py 2024-12-18 20:36:43 +02:00
Romashka
b606e22cd5 typo-Update exporters.py 2024-12-18 20:36:18 +02:00
Evgeny Medvedev
4943b0b795 Merge pull request #505 from XxAlex74xX/patch-1
typo README.md
2024-12-18 15:51:25 +08:00
XxAlex74xX
eed2068def Update README.md 2024-12-18 07:38:38 +01:00
Evgeny Medvedev
313b4b1237 Merge pull request #503 from Guayaba221/develop
docs fix spelling issues
2024-12-15 21:00:53 +08:00
planetBoy
ad6149155e Update exporting-the-blockchain.md 2024-12-15 10:11:54 +01:00
Evgeny Medvedev
c55c0f68dc Merge pull request #502 from futreall/develop
Fix significant typo in documentation
2024-12-15 11:43:33 +08:00
futreall
b031b04bc7 Update google-bigquery.md 2024-12-14 20:40:47 +02:00
Evgeny Medvedev
b314f1ed0c Merge pull request #501 from vtjl10/develop
fix: typos in documentation files
2024-12-15 00:21:31 +08:00
fuder.eth
61eb2e6e21 Update README.md 2024-12-14 13:27:01 +01:00
Evgeny Medvedev
9f62e7ecea Merge pull request #492 from nnsW3/docs-improvement
Docs improvement
2024-06-26 09:41:09 +08:00
Elias Rad
4da7e7b23f fix README.md 2024-06-25 20:06:41 +03:00
Elias Rad
de72ba3511 fix origin.py 2024-06-25 20:04:51 +03:00
Elias Rad
3aabf9aa54 fix schema.md 2024-06-25 20:02:55 +03:00
Elias Rad
284755bafc fix limitations.md 2024-06-25 20:02:26 +03:00
Elias Rad
23133594e8 fix index.md 2024-06-25 20:02:14 +03:00
evgeny
ca54ef6c4b Bump version 2024-04-11 19:42:39 +07:00
Evgeny Medvedev
836f30e198 Merge pull request #488 from blockchain-etl/add_dencun_fields_to_postgres_tables
Add Dencun fields to postgres_tables.py
2024-04-11 20:41:49 +08:00
evgeny
1c6508f15d Add Dencun fields to postgres_tables.py 2024-04-11 19:38:27 +07:00
Evgeny Medvedev
a4d6f8fcb1 Merge pull request #487 from blockchain-etl/add_readthedocs_yaml
Add .readthedocs.yaml
2024-04-11 10:58:07 +08:00
evgeny
bc79d7d9bf Add .readthedocs.yaml 2024-04-11 09:56:49 +07:00
medvedev1088
7fdcf0f7b7 Bump version 2024-04-03 12:42:38 +08:00
21 changed files with 84 additions and 31 deletions

14
.readthedocs.yaml Normal file

@@ -0,0 +1,14 @@
# Read the Docs configuration file for MkDocs projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Set the version of Python and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
mkdocs:
  configuration: mkdocs.yml

@@ -1,6 +1,6 @@
MIT License
Copyright (c) 2018 Evgeny Medvedev, evge.medvedev@gmail.com, https://twitter.com/EvgeMedvedev
Copyright (c) 2018-2025 Evgeny Medvedev, evge.medvedev@gmail.com, https://twitter.com/EvgeMedvedev
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
SOFTWARE.

@@ -1,9 +1,9 @@
# Ethereum ETL
[![Build Status](https://app.travis-ci.com/blockchain-etl/ethereum-etl.svg?branch=develop)](https://travis-ci.com/github/blockchain-etl/ethereum-etl)
[![Join the chat at https://gitter.im/ethereum-eth](https://badges.gitter.im/ethereum-etl.svg)](https://gitter.im/ethereum-etl/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![License](https://img.shields.io/github/license/blockchain-etl/ethereum-etl)](https://github.com/blockchain-etl/ethereum-etl/blob/develop/LICENSE)
[![Telegram](https://img.shields.io/badge/telegram-join%20chat-blue.svg)](https://t.me/BlockchainETL)
[![Discord](https://img.shields.io/badge/discord-join%20chat-blue.svg)](https://discord.gg/tRKG7zGKtF)
[![Twitter](https://img.shields.io/twitter/follow/EthereumETL)](https://x.com/EthereumETL)
Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.
@@ -27,7 +27,7 @@ Export blocks and transactions ([Schema](docs/schema.md#blockscsv), [Reference](
--provider-uri https://mainnet.infura.io/v3/7aef3f0cd1f64408b163814b22cc643c
```
Export ERC20 and ERC721 transfers ([Schema](docs/schema.md#token_transferscsv), [Reference](docs/commands.md##export_token_transfers)):
Export ERC20 and ERC721 transfers ([Schema](docs/schema.md#token_transferscsv), [Reference](docs/commands.md#export_token_transfers)):
```bash
> ethereumetl export_token_transfers --start-block 0 --end-block 500000 \
@@ -78,7 +78,7 @@ For the latest version, check out the repo and call
```bash
> pip3 install -e .[dev,streaming]
> export ETHEREUM_ETL_RUN_SLOW_TESTS=True
> export PROVIDER_URL=<your_porvider_uri>
> export PROVIDER_URL=<your_provider_uri>
> pytest -vv
```
@@ -109,9 +109,9 @@ For the latest version, check out the repo and call
> echo "Stream to console"
> docker run ethereum-etl:latest stream --start-block 500000 --log-file log.txt
> echo "Stream to Pub/Sub"
> docker run -v /path_to_credentials_file/:/ethereum-etl/ --env GOOGLE_APPLICATION_CREDENTIALS=/ethereum-etl/credentials_file.json ethereum-etl:latest stream --start-block 500000 --output projects/<your-project>/topics/crypto_ethereum
> docker run -v /path_to_credentials_file/:/ethereum-etl/ --env GOOGLE_APPLICATION_CREDENTIALS=/ethereum-etl/credentials_file.json ethereum-etl:latest stream --start-block 500000 --output projects/<your_project>/topics/crypto_ethereum
If running on Apple M1 chip add the `--platform linux/x86_64` option to the `build` and `run` commands e.g.:
If running on an Apple M1 chip add the `--platform linux/x86_64` option to the `build` and `run` commands e.g.:
```
docker build --platform linux/x86_64 -t ethereum-etl:latest .

@@ -45,7 +45,7 @@ class BaseItemExporter(object):
        self._configure(kwargs)
    def _configure(self, options, dont_fail=False):
        """Configure the exporter by poping options from the ``options`` dict.
        """Configure the exporter by popping options from the ``options`` dict.
        If dont_fail is set, it won't raise an exception on unexpected options
        (useful for using with keyword arguments in subclasses constructors)
        """

@@ -30,13 +30,18 @@
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
class SimpleItemConverter:
    def __init__(self, field_converters=None):
        self.field_converters = field_converters
    def convert_item(self, item):
        return {
            key: self.convert_field(key, value) for key, value in item.items()
        }
    def convert_field(self, key, value):
        return value
        if self.field_converters is not None and key in self.field_converters:
            return self.field_converters[key](value)
        else:
            return value
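
Reviewer note: a minimal sketch of how the new `field_converters` hook behaves, assuming the class ends up as shown in this hunk after the change. The sample item and the `str.upper` converter are made up for illustration and do not come from the repository.

```python
class SimpleItemConverter:
    """Post-change version of the class from this hunk, copied so the sketch runs standalone."""

    def __init__(self, field_converters=None):
        self.field_converters = field_converters

    def convert_item(self, item):
        return {
            key: self.convert_field(key, value) for key, value in item.items()
        }

    def convert_field(self, key, value):
        # Apply a per-field converter when one is registered, otherwise pass through.
        if self.field_converters is not None and key in self.field_converters:
            return self.field_converters[key](value)
        else:
            return value


# Hypothetical item and converter, chosen only to make the effect visible.
converter = SimpleItemConverter(field_converters={'hash': str.upper})
converted = converter.convert_item({'hash': '0xabc', 'number': 1})
print(converted)

# Without field_converters the class keeps its previous pass-through behaviour.
unchanged = SimpleItemConverter().convert_item({'hash': '0xabc'})
print(unchanged)
```

Because `field_converters` defaults to `None`, existing callers that construct `SimpleItemConverter()` with no arguments should see no change in output.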

@@ -1,4 +1,3 @@
# Contact
- [D5 Discord Server](https://discord.gg/wukrezR)
- [Telegram Group](https://t.me/joinchat/GsMpbA3mv1OJ6YMp3T5ORQ)

@@ -1,7 +1,5 @@
## Exporting the Blockchain
If you'd like to have blockchain data set up and hosted for you, [get in touch with us at D5](https://d5.ai/?ref=ethereumetl).
1. Install python 3.5.3+: [https://www.python.org/downloads/](https://www.python.org/downloads/)
1. You can use Infura if you don't need ERC20 transfers (Infura doesn't support eth_getFilterLogs JSON RPC method).

@@ -1,4 +1,4 @@
# Google BiqQuery
# Google BigQuery
## Querying in BigQuery
@@ -16,4 +16,4 @@ Read [this article](https://medium.com/google-cloud/building-token-recommender-i
### Awesome BigQuery Views
[https://github.com/blockchain-etl/awesome-bigquery-views](https://github.com/blockchain-etl/awesome-bigquery-views)
[https://github.com/blockchain-etl/awesome-bigquery-views](https://github.com/blockchain-etl/awesome-bigquery-views)

@@ -2,7 +2,7 @@
Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.
With 1,700+ likes on GitHub, Ethereum ETL is the most popular open source project for Ethereum data.
With 1,700+ likes on GitHub, Ethereum ETL is the most popular open-source project for Ethereum data.
Data is available for you to query right away in [Google BigQuery](https://goo.gl/oY5BCQ).
@@ -17,8 +17,31 @@ Easily export:
* Receipts
* Logs
* Contracts
* Internal transactions
* Internal transactions (traces)
## Advanced Features
* Stream blockchain data to Pub/Sub, Postgres, or other destinations in real-time
* Filter and transform data using flexible command-line options
* Support for multiple Ethereum node providers (Geth, Parity, Infura, etc.)
* Handles chain reorganizations through configurable lag
* Export data by block range or by date
* Scalable architecture with configurable batch sizes and worker counts
## Use Cases
* Data analysis and visualization
* Machine learning on blockchain data
* Building analytics dashboards
* Market research and token analysis
* Compliance and audit reporting
* Academic research on blockchain economics
## Projects using Ethereum ETL
* [Google](https://goo.gl/oY5BCQ) - Public BigQuery Ethereum datasets
* [Nansen](https://nansen.ai/query?ref=ethereumetl) - Analytics platform for Ethereum
* [Ethereum Blockchain ETL on GCP](https://cloud.google.com/blog/products/data-analytics/ethereum-bigquery-public-dataset-smart-contract-analytics) - Official Google Cloud reference architecture
## Getting Started
Check the [Quickstart](quickstart.md) guide to begin using Ethereum ETL or explore the [Commands](commands.md) page for detailed usage instructions.

@@ -4,7 +4,7 @@
which means `is_erc20` and `is_erc721` will always be false for proxy contracts and they will be missing in the `tokens`
table.
- The metadata methods (`symbol`, `name`, `decimals`, `total_supply`) for ERC20 are optional, so around 10% of the
contracts are missing this data. Also some contracts (EOS) implement these methods but with wrong return type,
contracts are missing this data. Also some contracts (EOS) implement these methods but with the wrong return type,
so the metadata columns are missing in this case as well.
- `token_transfers.value`, `tokens.decimals` and `tokens.total_supply` have type `STRING` in BigQuery tables,
because numeric types there can't handle 32-byte integers. You should use
@@ -12,4 +12,4 @@ because numeric types there can't handle 32-byte integers. You should use
`safe_cast(value as NUMERIC)` (possible overflow) to convert to numbers.
- The contracts that don't implement `decimals()` function but have the
[fallback function](https://solidity.readthedocs.io/en/v0.4.21/contracts.html#fallback-function) that returns a `boolean`
will have `0` or `1` in the `decimals` column in the CSVs.
will have `0` or `1` in the `decimals` column in the CSVs.

@@ -7,5 +7,4 @@
- [Introducing six new cryptocurrencies in BigQuery Public Datasets—and how to analyze them](https://cloud.google.com/blog/products/data-analytics/introducing-six-new-cryptocurrencies-in-bigquery-public-datasets-and-how-to-analyze-them)
- [Querying the Ethereum Blockchain in Snowflake](https://community.snowflake.com/s/article/Querying-the-Ethereum-Blockchain-in-Snowflake)
- [ConsenSys Grants funds third cohort of projects to benefit the Ethereum ecosystem](https://www.cryptoninjas.net/2020/02/17/consensys-grants-funds-third-cohort-of-projects-to-benefit-the-ethereum-ecosystem/)
- [Ivan on Tech overviews crypto datasets in BigQuery](https://youtu.be/2IkJBNhsXNY?t=239)
- [Unlocking the Power of Google BigQuery (Cloud Next '19)](https://youtu.be/KL_i5XZIaJg?t=131)

@@ -153,7 +153,7 @@ trace_id | string |
### Differences between geth and parity traces.csv
- `to_address` field differs for `callcode` trace (geth seems to return correct value, as parity value of `to_address` is same as `to_address` of parent call);
- `to_address` field differs for `callcode` trace (geth seems to return correct value, as parity value of `to_address` is the same as `to_address` of parent call);
- geth output doesn't have `reward` traces;
- geth output doesn't have `to_address`, `from_address`, `value` for `suicide` traces;
- `error` field contains human readable error message, which might differ in geth/parity output;

@@ -48,7 +48,7 @@ from ethereumetl.cli.stream import stream
@click.group()
@click.version_option(version='2.4.0')
@click.version_option(version='2.4.2')
@click.pass_context
def cli(ctx):
    pass

@@ -44,7 +44,7 @@ class BaseItemExporter(object):
        self._configure(kwargs)
    def _configure(self, options, dont_fail=False):
        """Configure the exporter by poping options from the ``options`` dict.
        """Configure the exporter by popping options from the ``options`` dict.
        If dont_fail is set, it won't raise an exception on unexpected options
        (useful for using with keyword arguments in subclasses constructors)
        """

@@ -15,7 +15,7 @@ def get_origin_ipfs_client():
# Parses the shop's HTML index page to extract the name of the IPFS directory under
# which all the shops data is located.
# which all the shop data is located.
def _get_shop_data_dir(shop_index_page):
    match = re.search('<link rel="data-dir" href="(.+?)"', shop_index_page)
    return match.group(1) if match else None
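
Reviewer note: the helper touched by this comment fix can be exercised in isolation. The sketch below copies the function body from the hunk; the HTML snippets and the `example-shop-dir` value are hypothetical.

```python
import re


def _get_shop_data_dir(shop_index_page):
    # Same regex as in the hunk: capture the href of the <link rel="data-dir"> tag.
    match = re.search('<link rel="data-dir" href="(.+?)"', shop_index_page)
    return match.group(1) if match else None


# Hypothetical index pages, for illustration only.
page_with_dir = '<html><head><link rel="data-dir" href="example-shop-dir" /></head></html>'
page_without_dir = '<html><head><title>No data dir</title></head></html>'

found = _get_shop_data_dir(page_with_dir)
missing = _get_shop_data_dir(page_without_dir)
print(found, missing)
```

The lazy `(.+?)` stops at the first closing quote, so only the directory name is captured; a page without the tag yields `None` instead of raising.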

@@ -95,7 +95,7 @@ class ExportOriginJob(BaseJob):
})
for batch in batches:
# https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_getfilterlogs
# https://ethereum.org/en/developers/docs/apis/json-rpc/#eth_getfilterlogs
filter_params = {
'address': batch['contract_address'],
'fromBlock': batch['from_block'],

@@ -65,7 +65,7 @@ class ExportTokenTransfersJob(BaseJob):
    def _export_batch(self, block_number_batch):
        assert len(block_number_batch) > 0
        # https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_getfilterlogs
        # https://ethereum.org/en/developers/docs/apis/json-rpc/#eth_getfilterlogs
        filter_params = {
            'fromBlock': block_number_batch[0],
            'toBlock': block_number_batch[-1],

@@ -54,7 +54,7 @@ class EthContractService:
c.implements('allowance(address,address)')
# https://github.com/ethereum/EIPs/blob/master/EIPS/eip-721.md
# https://github.com/OpenZeppelin/openzeppelin-solidity/blob/master/contracts/token/ERC721/ERC721Basic.sol
# https://github.com/OpenZeppelin/openzeppelin-contracts/blob/master/contracts/token/ERC721/ERC721.sol
# Doesn't check the below ERC721 methods to match CryptoKitties contract
# getApproved(uint256)
# setApprovalForAll(address,bool)

@@ -62,8 +62,12 @@ def create_item_exporter(output):
        from blockchainetl.jobs.exporters.converters.unix_timestamp_item_converter import UnixTimestampItemConverter
        from blockchainetl.jobs.exporters.converters.int_to_decimal_item_converter import IntToDecimalItemConverter
        from blockchainetl.jobs.exporters.converters.list_field_item_converter import ListFieldItemConverter
        from blockchainetl.jobs.exporters.converters.simple_item_converter import SimpleItemConverter
        from ethereumetl.streaming.postgres_tables import BLOCKS, TRANSACTIONS, LOGS, TOKEN_TRANSFERS, TRACES, TOKENS, CONTRACTS
        def array_to_str(val):
            return ','.join(val) if val is not None else None
        item_exporter = PostgresItemExporter(
            output, item_type_to_insert_stmt_mapping={
                'block': create_insert_statement_for_table(BLOCKS),
@@ -74,8 +78,12 @@ def create_item_exporter(output):
                'token': create_insert_statement_for_table(TOKENS),
                'contract': create_insert_statement_for_table(CONTRACTS),
            },
            converters=[UnixTimestampItemConverter(), IntToDecimalItemConverter(),
                        ListFieldItemConverter('topics', 'topic', fill=4)])
            converters=[
                UnixTimestampItemConverter(),
                IntToDecimalItemConverter(),
                ListFieldItemConverter('topics', 'topic', fill=4),
                SimpleItemConverter(field_converters={'blob_versioned_hashes': array_to_str})
            ])
    elif item_exporter_type == ItemExporterType.GCS:
        from blockchainetl.jobs.exporters.gcs_item_exporter import GcsItemExporter
        bucket, path = get_bucket_and_path_from_gcs_output(output)
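
Reviewer note: `array_to_str` presumably exists so that the list-valued `blob_versioned_hashes` field fits the `String` column added to the `TRANSACTIONS` table in this same change set. A small sketch of its behaviour on the three input shapes it can receive; the hash values are shortened placeholders, not real blob hashes.

```python
def array_to_str(val):
    # Same helper as in the hunk: join a list into one comma-separated string
    # and let None pass through unchanged.
    return ','.join(val) if val is not None else None


# Placeholder values for illustration only.
joined = array_to_str(['0x01aa', '0x01bb'])
missing = array_to_str(None)
empty = array_to_str([])
print(repr(joined), repr(missing), repr(empty))
```

Note that an empty list becomes an empty string rather than `None`, so a transaction with an empty `blob_versioned_hashes` list and one without the field at all would be passed on as different values.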

@@ -49,6 +49,9 @@ BLOCKS = Table(
    Column('gas_used', BigInteger),
    Column('transaction_count', BigInteger),
    Column('base_fee_per_gas', BigInteger),
    Column('withdrawals_root', String),
    Column('blob_gas_used', BigInteger),
    Column('excess_blob_gas', BigInteger),
)
TRANSACTIONS = Table(
@@ -78,6 +81,10 @@ TRANSACTIONS = Table(
    Column('receipt_l1_gas_used', BigInteger),
    Column('receipt_l1_gas_price', BigInteger),
    Column('receipt_l1_fee_scalar', Float),
    Column('max_fee_per_blob_gas', BigInteger),
    Column('blob_versioned_hashes', String),
    Column('receipt_blob_gas_price', BigInteger),
    Column('receipt_blob_gas_used', BigInteger),
)
LOGS = Table(

@@ -11,7 +11,7 @@ long_description = read('README.md') if os.path.isfile("README.md") else ""
setup(
    name='ethereum-etl',
    version='2.4.0',
    version='2.4.2',
    author='Evgeny Medvedev',
    author_email='evge.medvedev@gmail.com',
    description='Tools for exporting Ethereum blockchain data to CSV or JSON',