* file path as input and have parse be in OnnxRunner.__init__
* modelproto_to_onnxrunner -> modelproto_to_runner
* whoops, fix import
* oh flakiness again, is it because it's getting gc-ed?
* small changes
* CI flaky so just move compile4 fix in
* copy typing of onnx_load
* actually can just import onnx_load instead of onnx.load
* fix external_benchmark_openpilot
* fix onnx_runner test to use onnx_helper
* rerun CI
* try run_modelproto
* spam CI a few times
* revert run_modelproto since that's flaky also
* no external onnx_load usage except onnx.py
* Cursor tab completion is evil. Snuck a darn sorted in. But does order change the result? Why?
* model_benchmark 193s -> 80s, add OnnxRunner.to()...
* minimize diff and clean up
* device can be None, weird but eh
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* clean up unused imports in examples
* enable unused import checking in examples
* lint
* ignore F541 and F841 - focus on unused imports only
* clean up
* restore tinygrad.frontend.torch for TINY_BACKEND
* tiny change
* squash commits
* temp fix for const tensor
* actually realizing float16 can only happen in raw_data
* .float -> cast(float) to rerun CI
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* Don't use numpy inside hlb_cifar10 training loop
* Lint it
* jit it
* Drop the last half-batch
* Use gather for random_crop and reuse perms
* Wrap train_cifar in FUSE_ARANGE context
* No need to pass FUSE_ARANGE=1 to hlb_cifar10.py
* Add cutmix to jittable augmentations
* Remove .contiguous() from fetch_batches
* Fix indexing boundary
---------
Co-authored-by: Irwin1138 <irwin1139@gmail.com>
* move view left to the outer graph
* global view right
* don't need that one
* remove comment
* test kernelize
* simple
* split onnx, test sdxl null
* fix testing
* ugh, wrong one
* Update test.yml
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
* work on beautiful_cifar
* speed close to hlb_cifar
* schedule to corealize all
* one line sched step
* fewer lines
* add MobileNetV2 to comma CI
* symlink imagenet
* also the signature
* comment that out
* need imagenetmock
* same train and test set
* quantize on CPU=1
* verbose
* need __hexagon_divsf3
* 0x858d6c15
* quant cpu + CC=clang-19