the checker did not ask for many details. software versions will be added later, along with the tinygrad commit.
```
python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v4.0/tinycorp/systems/tinybox_red.json training 4.0.0
INFO - System description checker passed for tinybox red
```
```
python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v4.0/tinycorp/systems/tinybox_green.json training 4.0.0
INFO - System description checker passed for tinybox green
```
* fix mean underflow for half tensor
divide by only the reduce factor. added a unit test and a non-nan assertion in resnet training. also added a failing test case for a symbolic shape var
* skip for python backend
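
As a rough illustration of why the scaling step matters in half precision, here is a plain NumPy sketch. It is not the tinygrad code, and the numbers and variable names are made up for the demo. The reciprocal of a large reduce factor falls into float16's subnormal range (or rounds to zero), so multiplying a sum by it loses accuracy, while dividing by the factor itself does not. Whether this is the exact failure mode the fix addresses is an assumption.

```python
import numpy as np

# Hypothetical numbers: a float16 sum of 30000 over a reduce factor of 60000.
total = np.float16(30000.0)
factor = 60000

# The reciprocal lands in float16's subnormal range and loses precision,
# so scaling the sum by it gives a slightly wrong mean.
recip = np.float16(1.0 / factor)
mean_by_mul = total * recip

# The factor itself is exactly representable in float16,
# so dividing by it gives the exact mean in this case.
mean_by_div = total / np.float16(factor)

# For a large enough factor the reciprocal rounds all the way to zero.
tiny_recip = np.float16(1.0 / 2**26)

print(mean_by_mul, mean_by_div, tiny_recip)
```

Factors above the float16 maximum (65504) cannot be represented directly either, so a real implementation also has to choose the dtype in which the division happens; that choice is outside this sketch.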
* kernel: change PADTO check to allow up to 4x padding
also optionally remove PADTO from the search action space with
BEAM_PADTO=0.
* fix test_linearizer test_tensor_cores_padded tests
* update resnet runs to use SPLIT_REDUCEOP=1
* fix up search TC axis and amt checking
* fix up the dimensions of the TC tests
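
The 4x padding rule from the PADTO change above can be pictured with a small standalone check. This is a sketch of the idea only: `round_up` and `padto_allowed` are hypothetical helpers, not tinygrad functions, and the exact comparison used in the kernel code may differ.

```python
def round_up(n: int, multiple: int) -> int:
    """Round n up to the next multiple of `multiple`."""
    return ((n + multiple - 1) // multiple) * multiple

def padto_allowed(dim: int, amt: int, max_factor: int = 4) -> bool:
    """Hypothetical check: allow padding an axis of size `dim` to a multiple
    of `amt` only if the padded size is at most `max_factor` times `dim`."""
    padded = round_up(dim, amt)
    return padded <= max_factor * dim

# Print the padded size and the decision for a few axis sizes padded to 32.
for dim in (7, 8, 30):
    print(dim, round_up(dim, 32), padto_allowed(dim, 32))
```

Under this reading, an axis of 8 padded to 32 is exactly at the limit and is accepted, while an axis of 7 padded to 32 is rejected.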
* add support for train/val datasets for kits19
* split dataset into train and val sets
* add tests for kits19 dataloader
* add MLPerf dataset tests to CI
* update unet3d model_eval script
* fix linting
* add nibabel
* fix how mock dataset gets created
* update ref implementation with permalink and no edits
* clean up test and update rand_flip implementation
* cleanups
we want different BEAM values for resnet train and eval, which a global JITBEAM cannot do. added a flag to change beam behavior at cnt=0 (so by default it behaves the same with or without TinyJit); for cnt=1 it uses the existing BEAM.value.
also updated the context var BEAM in resnet to be outside of TinyJit. this saves about 3 minutes of compile time.
* add DICE loss and metrics
* update dice to include reference implementation's link
* remove unused imports
* remove unnecessary test file and update pred + label for metrics and losses test
* add tests to CI + add exclusion of mlperf_unet3d
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
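
For readers unfamiliar with the metric named in the DICE commit above, here is a minimal NumPy sketch of the standard Dice score, 2|X∩Y| / (|X| + |Y|), and the loss derived from it. It is not the code added in the commit or the reference implementation it links to; the function names and the `eps` smoothing term are assumptions for illustration.

```python
import numpy as np

def dice_score(pred, label, eps=1e-6):
    """Dice coefficient between two masks (binary or soft), in [0, 1].
    `eps` is an assumed smoothing term that avoids 0/0 on empty masks."""
    pred = np.asarray(pred, dtype=np.float64)
    label = np.asarray(label, dtype=np.float64)
    intersection = (pred * label).sum()
    return (2.0 * intersection + eps) / (pred.sum() + label.sum() + eps)

def dice_loss(pred, label, eps=1e-6):
    """Loss form: 0 for a perfect overlap, approaching 1 for no overlap."""
    return 1.0 - dice_score(pred, label, eps)

pred = [1, 1, 0, 0]
label = [1, 0, 0, 0]
print(dice_score(pred, label), dice_loss(pred, label))
```

Segmentation pipelines usually compute this per class and per sample and then average; that reduction is left out here.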
* resnet individual layer benchmarks!
* small
* 1 and 2
* mem_used
* no ci
* better conv print
* defaults
* prints
* adjust
* adjust
* adjust
* benchmark only one layer example
* tensor.training, zero_grad, sum instead of mean, last mem, last kernel count
* default jitcnt=1
* scale flops/kernels with jitcnt
* add note about jitcnt memory
* touchup
* write llm.c and add a few new methods to tensor
* training works
* add jit
* tests for new functions
* test tolist
* simple fix for onnx test failures (#4186)
* write llm.c and add a few new methods to tensor
* training works
* add jit
* tests for new functions
* bump line count to 7500
* simplest fix
* safenumpy tolist for now
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
---------
Co-authored-by: geohotstan <135171913+geohotstan@users.noreply.github.com>
* rewrite the jit in the context of new schedule
* mypy better
* fix placeholder
* tests
* all functionality should work
* fix tests
* no CacheCollector