7 Commits

Author SHA1 Message Date
George Hotz
0f25b4b289 move frontend dir to nn [pr] (#12470) 2025-10-07 10:42:22 +08:00
geohotstan
1e904155e3 Add Onnx Huggingface to test/models/test_onnx.py (#11468)
* BOOM

* cache extra/huggingface/models/

* why max buffer size is not 0

* override MAX_BUFFER_SIZE

* less models

* remove more models and change cache dir to already cached dir

* only metal

* less is more?

* remove check ops

* why is this not setting the ENVVAR

* ughhhhh just test in models

* only cpu and gpu

* only cpu actually

* just override it idk

* final

* move extra dependencies up top

* simplification

* fix print

* make README better

* revert ops_disk fix for now

* clean up test_onnx

* remove testing fashion clip model cuz sloooowwwwww

* actually let METAL run this

* fix comment mistake

* fix download path in run_models

* does this work?

* cleanup setup and teardown

* contextvar like this?

* prove model is cached

* do I need to increment DOWNLOAD_CACHE_VERSION?

* see if cached with incremented DOWNLOAD_CACHE_VERSION

* use warnings to see if the model exists

* revert DOWNLOAD_CACHE_VERSION stuff and clean up

* add retry to download

* nit
2025-08-14 11:16:41 -04:00
geohotstan
5ce278b245 OnnxRunner file as input (#10789)
* file path as input and have parse be in OnnxRunner.__init__

* modelproto_to_onnxrunner -> modelproto_to_runner

* whoops, fix import

* oh flakiness again, is it because it's getting gc-ed?

* small changes

* CI flaky so just move compile4 fix in

* copy typing of onnx_load

* actually can just import onnx_load instead of onnx.load

* fix external_benchmark_openpilot

* fix onnx_runner test to use onnx_helper

* rerun CI

* try run_modelproto

* spam CI a few times

* revert run_modelproto since that's flaky also

* no external onnx_load usage except onnx.py

* cursor tab complete is evil. Snuck a darn sorted in. But does order change result? Why?

* model_benchmark 193s -> 80s, add OnnxRunner.to()...

* minimize diff and clean up

* device can be None, weird but eh

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-12 14:27:46 -04:00
b1tg
24d328e313 onnx parser (#10435)
* onnx parser

* fix compile, lint

* onnx.load -> onnx_load

* compatible with ModelProto

* fix test external_test_onnx_ops.py

* fix tests

* fix signed int

* reduce to 261 lines

* fix TypeProto.Optional

* debug for _parse_message, add TypeProto.Sequence, cleanup

* onnx_load from Tensor

* remove BufferedReader

* 174 lines and reduce tensor copy

* cleanup

* use onnx_load in external_model_benchmark.py

* fix qcom test

* [onnx] parser support external data

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-09 12:44:28 -04:00
geohotstan
dedff0e96c fix run huggingface onnx debug (#10679) 2025-06-08 00:59:20 -04:00
George Hotz
74d98eafb8 add onnx frontend stub [pr] (#9558) 2025-03-24 12:24:34 +08:00
geohotstan
0bed9b6cd2 benchmark huggingface onnx models (#8493)
* add ability to ORT=1

* test_vs_ort

* useless f

* actually have benchmark take in modelproto for more flexibility in huggingface stuff

* ok runs

* good

* oops fix benchmark_onnx __main__

* 224 as default

* add ORT=1 option to huggingface_onnx

* use Tensor to get_input

* add abilty to do single onnx model testing

* better names

* merge properly...

* copy in onnx_helpers

* better

* decent script

* need to add debug tool first

* new limit usage

* why did narrowing_error come back..

* pretty decent

* revert validate change

* more ops bug fixes

* revert unnecessary changes

* fix InstanceNorm too

* remove op from O4

* minimize diff

* address old feedback

* unsure of this, just revert

* remove that assert

* working attention

* to_python_const Attention

* cant init from np constant so just do this

* final

* fix bug in attention

* attention clean ups

* add hard TODOs and REPOPATH and TRUNCATE envvar

* fix input_ids default value

* final

* fix scatter

* cleaner _prepare_quantize

* use new attention and tempfile for huggingface script

* more stats

* update

* remove outdated code

* big refactor to something usable by CI

* booooooom

* clean up

* update to using yaml as env var input

* add dry run

* try

* valid pad

* use argparser and fix gather bug

* ignore all yaml

* tiny bit more polish

* woah ignoring all yaml was not right

* typo

* decouple huggingface_onnx_run debug run with huggingface_onnx_download

* bug fix for downloading single model

* WOOOO ok much better

* oops argparse 'required' is an invalid argument for positionals

* oops argparse 'required' is an invalid argument for positionals

* add assert

* fix types

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-12 20:13:12 -04:00