clean up training docs

Alex O'Connell
2024-08-16 22:45:27 -04:00
parent 0487a82f67
commit 9d0295b4f5
2 changed files with 81 additions and 57 deletions

@@ -84,6 +84,53 @@ if mary is 7 years old, and I am 3 years older than her. how old am I?<|endoftext|>
If Mary is 7 years old, then you are 10 years old (7+3=10).<|endoftext|>
```
<details>
<summary>Training Details</summary>
The 3B model was trained as a full fine-tuning on 2x RTX 4090 (48GB). Training took approximately 28 hours. It was trained on the `--large` dataset variant.
```console
accelerate launch --config_file fsdp_config.yaml train.py \
--run_name home-3b \
--base_model stabilityai/stablelm-zephyr-3b \
--bf16 \
--train_dataset data/home_assistant_train.jsonl \
--learning_rate 1e-5 \
--batch_size 64 \
--epochs 1 \
--micro_batch_size 2 \
--gradient_checkpointing \
--group_by_length \
--ctx_size 2048 \
--save_steps 50 \
--save_total_limit 10 \
--eval_steps 100 \
--logging_steps 2
```
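The `fsdp_config.yaml` referenced above is not included in this excerpt. As a rough sketch, an accelerate FSDP config for two GPUs might look like the following (hypothetical, not the actual file used; exact key names vary across accelerate versions):
```console
cat > fsdp_config.yaml <<'EOF'
# Hypothetical accelerate FSDP config for 2-GPU full fine-tuning.
# Key names vary across accelerate versions; adjust as needed.
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
mixed_precision: bf16
num_machines: 1
num_processes: 2  # one process per RTX 4090
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_offload_params: false
EOF
```
Assuming `--batch_size` is the effective global batch size, `--batch_size 64` with `--micro_batch_size 2` on 2 GPUs implies 16 gradient-accumulation steps per optimizer update.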
The 1B model was trained as a full fine-tuning on an RTX 3090 (24GB). Training took approximately 2 hours. It was trained on the `--medium` dataset variant.
```console
python3 train.py \
--run_name home-1b \
--base_model TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
--bf16 \
--train_dataset data/home_assistant_train.jsonl \
--test_dataset data/home_assistant_test.jsonl \
--learning_rate 2e-5 \
--batch_size 32 \
--micro_batch_size 8 \
--gradient_checkpointing \
--group_by_length \
--ctx_size 2048 \
--save_steps 100 \
--save_total_limit 10 \
--prefix_ids 29966,29989,465,22137,29989,29958,13 \
--suffix_ids 2
```
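The `--prefix_ids`/`--suffix_ids` values are raw token IDs in the base model's tokenizer. One way to sanity-check what text they correspond to (a sketch, assuming the Hugging Face `transformers` tokenizer for the base model):
```console
python3 - <<'EOF'
# Decode the prefix/suffix token IDs from the command above.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
print(repr(tok.decode([29966, 29989, 465, 22137, 29989, 29958, 13])))  # prefix_ids
print(repr(tok.decode([2])))                                           # suffix_ids
EOF
```
With this tokenizer, the prefix should decode to the chat template's assistant header (`<|assistant|>\n`) and the suffix to the end-of-sequence token (`</s>`), presumably delimiting the assistant response span within each training example.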
</details>
### Synthetic Dataset
The synthetic dataset is aimed at covering basic day-to-day operations in Home Assistant, such as turning devices on and off.
The supported entity types are: `light`, `fan`, `cover`, `lock`, `media_player`, `climate`, and `switch`.
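As a rough illustration of that coverage (a hypothetical sketch, not the repository's actual sample generator; the service names shown are the standard Home Assistant ones for each domain):
```console
python3 - <<'EOF'
# Hypothetical sketch: enumerate the basic on/off-style operations the
# synthetic dataset is described as covering, per supported entity type.
ON_OFF_SERVICES = {
    "light": ("turn_on", "turn_off"),
    "fan": ("turn_on", "turn_off"),
    "switch": ("turn_on", "turn_off"),
    "media_player": ("turn_on", "turn_off"),
    "climate": ("turn_on", "turn_off"),
    "cover": ("open_cover", "close_cover"),
    "lock": ("lock", "unlock"),
}
for domain, services in ON_OFF_SERVICES.items():
    for service in services:
        print(f"{domain}: {domain}.{service}")
EOF
```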

File diff suppressed because one or more lines are too long