Francis Lata
7dba815c47
fix train script
2025-02-19 20:43:02 +00:00
Francis Lata
fc36f09b1e
no need to return loaded keys for resnet
2025-02-19 20:35:03 +00:00
Francis Lata
41378e74a6
model init, hyperparam, and data preprocessing updates
2025-02-19 18:47:06 +00:00
Francis Lata
cfa1c2d50e
hyperparameter adjustments and cleanups
2025-02-14 17:53:06 +00:00
Francis Lata
caf9b2baa2
Merge branch 'master' into retinanet_mlperf
2025-02-14 06:28:37 +00:00
chenyu
9e91898941
bert eval at the end of training (#9070)
always eval at the last epoch
2025-02-13 16:29:44 -05:00
Francis Lata
3a2f126e7b
Merge branch 'master' into retinanet_mlperf
2025-02-13 15:40:10 +00:00
Francis Lata
5f26692068
remove frozen layers from optimizer's params
2025-02-13 06:36:13 +00:00
chenyu
f4f56d7c15
move time_linearizer to extra.optimization.helpers [pr] (#9048)
no longer used in tinygrad
2025-02-12 15:49:58 -05:00
Francis Lata
ff301f0be9
minor cleanups
2025-02-12 16:03:38 +00:00
Francis Lata
f61b10450e
Merge branch 'master' into retinanet_mlperf
2025-02-12 15:47:05 +00:00
chenyu
7b5ac2c15e
free_intermediates in bert (#9040)
also re-enable dropout and update EVAL_BS
2025-02-12 10:00:39 -05:00
Ahmed Harmouche
916d5e7f08
WebGPU f16 support (f16 bounty part 2) (#8653)
* WebGPU f16 support
* Don't enable f16 yet
* dtype tests passing after bitcast fix
* Maybe all WebGPU green?
* Require shader-f16 in examples
* Minor wgsl touchup
* 1 line shorter
* Simpler
* Add transcendental support
* log2 nan location mismatch on Vulkan
* Nan skips
2025-02-12 19:46:53 +08:00
George Hotz
0568720a68
delete revectorize (#9000)
* delete revectorize
* test vectorized LLVM/CLANG
* idk about that
* was that the segfault?
2025-02-10 18:32:35 +08:00
Francis Lata
37aab697b8
adjust LR to be the ratio of the batch size
2025-02-07 19:46:54 +00:00
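The LR adjustment in the commit above suggests the linear scaling rule: scale the base learning rate by the ratio of the actual batch size to a reference batch size. A minimal sketch (illustrative only; the function name and reference batch size are assumptions, not the repo's code):

```python
# Illustrative sketch of the linear LR scaling rule -- not the actual
# training-script code. base_lr is the LR tuned at reference_batch_size;
# the effective LR grows proportionally with the batch size.
def scaled_lr(base_lr: float, batch_size: int, reference_batch_size: int) -> float:
    return base_lr * (batch_size / reference_batch_size)
```

For example, a base LR of 0.1 tuned at batch size 32 would be scaled to 0.8 when training at batch size 256.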
Francis Lata
041481f910
Merge branch 'master' into retinanet_mlperf
2025-02-07 15:28:29 +00:00
George Hotz
4de084a835
cleanup ci, split docs/autogen, testing_minimal, LLVM Speed [pr] (#8952)
* cleanup ci [pr]
* testing_minimal
* add hypothesis to minimal
* fail tiktoken import okay
* add LLVM speed test
* llvm speed w/o beam
2025-02-07 19:01:59 +08:00
chenyu
a092b6395d
Tuple -> tuple, List -> list [pr] (#8936)
2025-02-06 14:21:19 -05:00
George Hotz
8b16c65bca
add compile3 benchmark [pr] (#8929)
2025-02-06 22:49:31 +08:00
geohotstan
6fb0e5751b
hotfix test_onnx_imagenet (#8897)
* start
* log severity
* only change this
* change abstraction so it's more usable for huggingface
* WHOOPS
* actually this is more correct
2025-02-05 14:39:55 +08:00
geohotstan
057c70b05f
add onnx_helpers to extra and add ort validate to benchmark_onnx (#8890)
* start
* log severity
* only change this
* change abstraction so it's more usable for huggingface
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-04 16:36:01 -05:00
Francis Lata
a483c0d231
Merge branch 'master' into retinanet_mlperf
2025-02-03 19:54:55 +00:00
Francis Lata
83a2b84f55
add validation loop to training script
2025-02-03 19:54:22 +00:00
Francis Lata
f02cce0049
remove unnecessary targets from validation dataloader
2025-02-03 19:15:30 +00:00
George Hotz
f484db0e63
dsp cleanups [pr] (#8866)
2025-02-03 15:18:53 +08:00
Francis Lata
932cf4b7f2
fix img_ids repeating its values
2025-02-02 19:21:46 +00:00
Francis Lata
17ae62d741
cleanup boxes and labels in dataloader
2025-02-02 18:51:14 +00:00
Francis Lata
cbacd195e0
Merge branch 'master' into retinanet_mlperf
2025-02-02 14:51:56 +00:00
Francis Lata
594d7126d8
return validation targets in dataloader
2025-02-02 06:50:21 -08:00
George Hotz
42d7c800a1
hotfix: add missing tinychat fonts + other assets
2025-02-01 09:34:44 +08:00
Francis Lata
811893a3bd
cleanup train and validation dataloader
2025-01-31 16:59:37 -08:00
Francis Lata
6d70035c22
put back parallel testing and remove img_ids Tensor from dataloader
2025-01-31 16:13:02 -08:00
Francis Lata
9938a1aabc
remove optional disk tensors in dataloader
2025-01-31 09:07:39 -08:00
Francis Lata
80fa9dd731
fix issue with realized on dataloader
2025-01-31 08:31:25 -08:00
Francis Lata
0a02a55430
Merge branch 'master' into retinanet_mlperf
2025-01-29 10:01:30 -08:00
Francis Lata
335d11281c
add multi device support to retinanet eval
2025-01-29 10:00:46 -08:00
chenyu
c7ca7959e6
set DISABLE_DROPOUT=1 in bert script for now (#8799)
2025-01-29 10:51:29 -05:00
Francis Lata
fc957e7377
update validation dataloader and more cleanups
2025-01-28 03:02:17 -08:00
chenyu
c99ae81f63
update default resnet LOSS_SCALER to 256 [pr] (#8774)
2025-01-27 16:59:05 -05:00
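The LOSS_SCALER bump above refers to static loss scaling for mixed-precision training: multiply the loss before the backward pass so small gradients don't underflow in half precision, then divide the gradients back before the optimizer step. A minimal sketch (illustrative only; names are assumptions, not the repo's code):

```python
# Illustrative sketch of static loss scaling -- not the actual repo code.
# The loss is scaled up before backward so tiny fp16 gradients survive;
# the resulting gradients are divided by the same factor before the update.
LOSS_SCALER = 256.0

def scale_loss(loss: float) -> float:
    return loss * LOSS_SCALER

def unscale_grad(grad: float) -> float:
    return grad / LOSS_SCALER
```

The round trip is lossless for the optimizer: scaling then unscaling recovers the original gradient magnitude.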
Francis Lata
3b9e5a3ed4
debug test
2025-01-27 08:35:36 -08:00
Francis Lata
2177053076
Merge branch 'master' into retinanet_mlperf
2025-01-27 08:07:19 -08:00
Francis Lata
e91733baae
refactor training loop and start the work on val loop
2025-01-27 07:27:19 -08:00
George Hotz
e82ba1454b
MultiLazyBuffer is UOp [pr] (#8662)
* MultiLazyBuffer is UOp [pr]
* this is new mlb
* this is the idea
* progress
* multitensor works
* more movement ops
* this
* MultiLazyBuffer is UOp
* cleanups
* multi axis
* fix more tests
* work
* not that
* add multi grad and move shard to ops
* mops not views
* no double contig
* sweet, all mt tests passing
* port old logic
* remove lbs
* fix realized
* whitespace
* assign tweak
* test_assign_kv_cache_multi passes
* fix is_realized
* fix JIT for multi
* just a few more lines i'll pay them back soon i swear please bro just a few more
* no split reduceop for multi
2025-01-24 13:28:55 +09:00
chenyu
eb77488f85
update llama3 70B to use R1 (#8733)
2025-01-23 19:06:05 -05:00
chenyu
af65331b76
update beam params for bert green [pr] (#8726)
increase BEAM_UPCAST_MAX and BEAM_LOCAL_MAX defaults to match red. 3% faster step
2025-01-22 22:00:05 -05:00
Francis Lata
6fdcaa178b
add checkpointing and training resume capabilities
2025-01-22 14:20:17 -08:00
Francis Lata
95cdbbf237
add jit to the training loop
2025-01-22 12:31:29 -08:00
Francis Lata
efe64ebeaf
enable lr scheduler and fix benchmark timing
2025-01-22 09:56:38 -08:00
chenyu
9a9079118e
envvar BERT_LAYERS [pr] (#8709)
default is 24 for large
2025-01-21 22:49:19 -05:00
chenyu
9f6d545a16
bert log global_norm in training step [pr] (#8708)
* bert log global_norm in training step [pr]
and minor cleanups
* .item()
2025-01-21 20:36:27 -05:00