* file path as input and have parse be in OnnxRunner.__init__
* modelproto_to_onnxrunner -> modelproto_to_runner
* whoops, fix import
* oh flakiness again, is it because it's getting gc-ed?
* small changes
* CI flaky so just move compile4 fix in
* copy typing of onnx_load
* actually can just import onnx_load instead of onnx.load
* fix external_benchmark_openpilot
* fix onnx_runner test to use onnx_helper
* rerun CI
* try run_modelproto
* spam CI a few times
* revert run_modelproto since that's flaky also
* no external onnx_load usage except onnx.py
* cursor tab complete is evil. Snuck a darn sorted in. But does order change result? Why?
* model_benchmark 193s -> 80s, add OnnxRunner.to()...
* minimize diff and clean up
* device can be None, weird but eh
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* start LLM app, tons of clean up required. target is 200 line ollama
* kind of works
* simpler
* add k/v cache
* with SYM=1, it loops
* no rope cache
* simpler
* more cleanups
* cleanups
* works
* argparse and comments
* from gguf
* generate is a function
* no copy from cpu
* fix max context pass in
* test
* improve test
* ai2_arc
* fix 8B, use less ram
* 136 lines
* merge view infinite loop test
* adjust condition in `x//d -> x//(-d)*-1`
* Fix division by zero in add views
* adjust offset end
* fix typo in comment
* add target to test_merge_views_variable
* fix view incorrectly being masked
* simplify strides and offset of the new view to canonicalize
* remove print in test
---------
Co-authored-by: qazal <qazal.software@gmail.com>
* print inputs to get_program in process replay [pr]
* colors
* keep dataclass default escapes
* Revert "keep dataclass default escapes"
This reverts commit c6db7e8a7a.
* note for ast_repr
* add that back
* initial commit
* add qr + expand svd to full matrix
* add odd number support
* add linalg tests
* qr supports dims of arbitrary size
* add qr tests
* svd supports dims of arbitrary size
* small cleanup
* improvements over svd batch handling
* improve linalg tests
* make u_pad match q shape
* add nonfull matrix tests
* little less verbose nonfull svd test
* added dtypes on svd + return vt instead of v
* lint
* more lint
* lint + set seed
* small fix
* small lint
* lint
* add int casting to indices and shapes
* remove int from shape tuple in svd
* small cleanup
* add return types
* reuse inverse_permute
* refactoring
* whitespace
* remove regularization term to prevent bad outputs on ill-conditioned matrices
* remove seed
* refactor
* lint
* refactor
* spacing
* remove clone
* line reduction
* smarter heuristic for iterations_per_round
* add big test
* lint
* turns out no constant needed?
* wrap tests
* some small matrices need the constant
* remove realize
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>