Monero Dataset Pipeline
A pipeline that automates creating Monero wallets and sending transactions between them in order to collect a dataset suitable for supervised learning applications.
Installation
sudo apt update
sudo apt install vim git jq expect tmux parallel python3 python3-tk bc curl python3-pip -y
pip3 install numpy
cd ~ && wget https://downloads.getmonero.org/cli/monero-linux-x64-v0.17.3.0.tar.bz2
tar -xvf monero-linux-x64-v0.17.3.0.tar.bz2 && cd monero-x86_64-linux-gnu-v0.17.3.0 && sudo cp monero* /usr/bin/ && cd ..
git clone git@github.com:ACK-J/Monero-Dataset-Pipeline.git && cd Monero-Dataset-Pipeline
chmod +x ./run.sh && chmod 777 -R Funding_Wallets/
# Make sure global variables are set
./run.sh
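Before launching the pipeline, it can be worth confirming that the prerequisites installed above are actually reachable. A minimal, optional sanity check in Python (not part of the repository):

# Optional pre-flight check (not part of the repository): confirm the Monero CLI
# binaries copied into /usr/bin and the numpy dependency are available.
import shutil
import sys

for binary in ("monerod", "monero-wallet-cli", "monero-wallet-rpc"):
    if shutil.which(binary) is None:
        sys.exit(binary + " not found on PATH; re-check the Monero CLI install step")

try:
    import numpy  # noqa: F401
except ImportError:
    sys.exit("numpy is missing; run 'pip3 install numpy'")

print("Prerequisites look OK")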
Stagenet Dataset
| File | Size | Serialized | Description |
|---|---|---|---|
| dataset.csv | 1.4GB | | The exhaustive dataset including all metadata for each transaction, in CSV format. |
| dataset.json | 1.5GB | ✅ | The exhaustive dataset including all metadata for each transaction, in JSON format. |
| X.csv | 4.1GB | | A modified version of dataset.csv with all features irrelevant to machine learning removed, in CSV format. |
| X.pkl | 6.5GB | ✅ | A modified version of dataset.json with all features irrelevant to machine learning removed, as a pickled pandas DataFrame. |
| y.pkl | 9.5MB | ✅ | A pickled list of Python dictionaries containing private information about the corresponding index of X.pkl. |
| X_Undersampled.csv | 1.4GB | | A modified version of X.csv with all data points shuffled and undersampled. |
| X_Undersampled.pkl | 2.3GB | ✅ | A modified version of X.pkl with all data points shuffled and undersampled. |
| y_Undersampled.pkl | 325kB | ✅ | A pickled list containing the labels corresponding to the indices of X_Undersampled.pkl. |
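The CSV variants can be inspected directly, without unpickling anything. A small sketch, assuming pandas is installed (it is not installed by the steps above) and the files sit in ./Dataset_Files/:

# Sketch only: peek at the first rows of the CSV feature matrix with pandas.
# pandas is an assumed extra dependency (pip3 install pandas).
import pandas as pd

sample = pd.read_csv("./Dataset_Files/X.csv", nrows=1000)  # X.csv is ~4.1GB, so only read a slice
print(sample.shape)               # (rows read, number of feature columns)
print(list(sample.columns)[:10])  # first few feature names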
Dataset Download Link
| File | Size | Description | Link |
|---|---|---|---|
| Stagenet_Dataset_7_2_2022.7z | 837 MB | Includes all files listed in the Stagenet Dataset table above, compressed using 7-Zip. | https://drive.google.com/file/d/1cmkb_7_cVe_waLdVJ9USdK07SPWgdgva/view |
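Once downloaded, the archive can be extracted with any 7-Zip tool. If you prefer staying in Python, the third-party py7zr package works as well; a sketch, where the archive location and output directory are only example paths:

# Sketch: extract the downloaded archive using py7zr (pip3 install py7zr).
# Paths are examples; adjust them to wherever the archive was saved.
import py7zr

with py7zr.SevenZipFile("Stagenet_Dataset_7_2_2022.7z", mode="r") as archive:
    archive.extractall(path="./Dataset_Files/")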
How to load the dataset using Python and pickle
import json
import pickle

# Full dataset with all transaction metadata
with open("./Dataset_Files/dataset.json", "r") as fp:
    data = json.load(fp)
# Feature matrix (pickled pandas DataFrame)
with open("./Dataset_Files/X.pkl", "rb") as fp:
    X = pickle.load(fp)
# Labels: one dict of private ground-truth information per row of X
with open("./Dataset_Files/y.pkl", "rb") as fp:
    y = pickle.load(fp)
# Shuffled and undersampled feature matrix
with open("./Dataset_Files/X_Undersampled.pkl", "rb") as fp:
    X_Undersampled = pickle.load(fp)
# Labels corresponding to X_Undersampled
with open("./Dataset_Files/y_Undersampled.pkl", "rb") as fp:
    y_Undersampled = pickle.load(fp)
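Continuing from the snippet above, a typical next step is a train/test split on the undersampled data. A minimal sketch, assuming scikit-learn is installed (it is not installed by the steps above):

# Sketch only: split the undersampled features/labels for supervised learning.
# scikit-learn is an assumed extra dependency (pip3 install scikit-learn).
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_Undersampled, y_Undersampled, test_size=0.2, random_state=42
)
print(len(X_train), "training rows,", len(X_test), "test rows")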
Problem Solving and Useful Commands
If collect.sh throws the error Failed to create a read transaction for the db: MDB_READERS_FULL: Environment maxreaders limit reached, clear the stale LMDB readers with:
/home/user/monero/external/db_drivers/liblmdb/mdb_stat -rr ~/.bitmonero/testnet/lmdb/
Check the progress of collect.sh while it's running
find ./ -iname "*.csv" | cut -d '/' -f 2 | sort -u
After running collect.sh, gather the ring positions
find . -name "*outgoing*" | xargs cat | cut -f 6 -d ',' | grep -v Ring_no/Ring_size | cut -f 1 -d '/'
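The same ring-position tally can be done in Python if a counted distribution is more useful than a raw list. A sketch under the same assumption as the command above, namely that the sixth comma-separated field of every outgoing CSV is Ring_no/Ring_size:

# Sketch: count how often the true spend lands in each ring position.
# Assumes the sixth comma-separated field of every *outgoing* CSV is
# "Ring_no/Ring_size", as in the shell pipeline above.
from collections import Counter
from pathlib import Path

positions = Counter()
for path in Path(".").rglob("*outgoing*"):
    if not path.is_file():
        continue
    for line in path.read_text().splitlines():
        fields = line.split(",")
        if len(fields) < 6:
            continue  # malformed or short line
        field = fields[5].strip()
        if field.startswith("Ring_no/Ring_size"):
            continue  # header row
        positions[field.split("/")[0]] += 1

for ring_no, count in sorted(positions.items()):
    print(ring_no, count)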
Data Collection Pipeline Flowcharts
Flowchart images for Run.sh, Collect.sh, and Create_Dataset.py (diagrams not reproduced here).