2023-08-06 11:22:14 -07:00

## ⛓🛠 blockchain data engineering
<br>
##### 👉 this repository contains my blockchain engineering projects, such as scalable event scanners and infrastructure setups for on-chain analysis and machine learning model training (*e.g.*, HFT with deep learning).
##### 🛠 here is a high-level system design chart for a possible blockchain intelligence data platform (all deployed in kubernetes):
<br>
<p align="center">
<img src="https://user-images.githubusercontent.com/1130416/224561453-274c5066-240d-4cc5-b63b-b4c57388a0e0.png" width="80%" align="center" style="padding:1px;border:1px solid black;"/>
</p>
<br>
<br>
---
### scanners
<br>
* **[token-scanner-api](token-scanner-api)**:
- an mvp of a **scalable event scanner cli and api for ethereum** that indexes and parses block events. this is the first step toward training **machine learning models on on-chain data** (e.g., high-frequency trading with deep learning).
- check my mirror post **[on building a scalable event scanner for ethereum](https://mirror.xyz/steinkirch.eth/vSF18xcLyfXLIWwxjreRa3I_XskwgnjSc6pScegNJWI)**.
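
to make "indexing and parsing block events" concrete, here is a minimal sketch in python. it is not the code in `token-scanner-api`; it assumes logs shaped like the ethereum json-rpc `eth_getLogs` response (hex strings) and only handles the standard erc-20 `Transfer` event. the names `parse_transfer`, `index_balances`, and `SAMPLE_LOGS` are hypothetical, and the sample logs are made up for illustration.

```python
from collections import defaultdict
from typing import Optional

# keccak256("Transfer(address,address,uint256)"): topic0 of the erc-20 Transfer event
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def parse_transfer(log: dict) -> Optional[dict]:
    """Decode one raw log into a transfer record, or None if it is not an erc-20 Transfer."""
    topics = log["topics"]
    # erc-20 Transfer has exactly 3 topics: the signature, `from`, and `to`
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    data = log["data"]
    return {
        "block": int(log["blockNumber"], 16),
        "from": topic_to_address(topics[1]),
        "to": topic_to_address(topics[2]),
        # the non-indexed uint256 amount lives in the data field
        "value": int(data, 16) if data not in ("", "0x") else 0,
    }


def index_balances(logs: list) -> dict:
    """Fold transfer records into net balance changes per address."""
    balances = defaultdict(int)
    for log in logs:
        transfer = parse_transfer(log)
        if transfer is None:
            continue  # skip events the scanner does not understand
        balances[transfer["from"]] -= transfer["value"]
        balances[transfer["to"]] += transfer["value"]
    return dict(balances)


# hypothetical addresses and logs, for illustration only
ALICE = "0x" + "a" * 40
BOB = "0x" + "b" * 40


def pad(address: str) -> str:
    """Left-pad a 20-byte address to a 32-byte topic."""
    return "0x" + "0" * 24 + address[2:]


SAMPLE_LOGS = [
    {"blockNumber": "0x10", "topics": [TRANSFER_TOPIC, pad(ALICE), pad(BOB)],
     "data": "0x" + format(1000, "064x")},
    {"blockNumber": "0x11", "topics": [TRANSFER_TOPIC, pad(BOB), pad(ALICE)],
     "data": "0x" + format(250, "064x")},
    # a log with an unrelated topic0 is ignored
    {"blockNumber": "0x12", "topics": ["0x" + "1" * 64], "data": "0x"},
]

if __name__ == "__main__":
    print(index_balances(SAMPLE_LOGS))
```

in a real scanner the logs would be fetched from a node in block-range chunks rather than hard-coded, and the fold would write to a database instead of an in-memory dict.
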
<br>
---
### technologies
<br>
* **[apache arrow](technologies/arrow_project.md)**
* **[rlp encoding](technologies/rlp_enconding.md)**
* **[spotify's luigi](technologies/luigi.md)**
* **[google's or-tools](technologies/or_tools.md)**
<br>
---
### external resources
<br>
* **[go-outside-labs ml-hft-agents](https://github.com/go-outside-labs/hft-deep-learning-agents)**
* **[go-outside-labs orchestration-toolkit](https://github.com/go-outside-labs/orchestration-toolkit)**
* **[google bigquery article on blockchain public datasets](https://cloud.google.com/blog/products/data-analytics/introducing-six-new-cryptocurrencies-in-bigquery-public-datasets-and-how-to-analyze-them)**
* **[paradigm's data portal](https://data.paradigm.xyz/)**