George Hotz
744af193f0
remove ScheduleItem and merge it with ExecItem ( #13759 )
...
* remove ExecItem and merge it with ScheduleItem
* less diff
* fix issues
* min diff
* don't change bufs in _lower
* min diff
* update
* revert
* fixes
* diff
2025-12-19 17:04:24 -04:00
qazal
366badaa68
require renderer argument in get_program, removes device opening in process replay [pr] ( #13524 )
2025-12-03 02:05:31 +08:00
chenyu
285534ce64
delete DONT_REALIZE_EXPAND and DONT_GROUP_REDUCES ( #12744 )
...
does nothing now
2025-10-16 14:11:33 -04:00
George Hotz
0f25b4b289
move frontend dir to nn [pr] ( #12470 )
2025-10-07 10:42:22 +08:00
b1tg
42748ccb92
rangeify: fix test_prequant_conv2d_1x1 ( #12391 )
2025-10-01 02:33:47 -04:00
Sieds Lykles
b98f1881ef
dsp opt test has different axis number on rangeify ( #12309 )
2025-09-27 05:06:11 +02:00
George Hotz
ee4f696086
delete more tests ( #12043 )
...
* delete more tests
* delete and simplify
* flaky on windows
* a few more, those remained
2025-09-05 15:31:30 -07:00
George Hotz
4b3fcb4064
Revert "REDUCE_AXIS keepdim=False ( #11311 )" ( #11718 )
...
This reverts commit b518a7378a.
2025-08-18 13:28:53 -07:00
b1tg
b518a7378a
REDUCE_AXIS keepdim=False ( #11311 )
...
* progress
* fix tests
* fix tests
* remove hack for test_symfold
* fix test_conv.py on llvm
* hack test_cache_speed
* lint
* remove hack for helper_linearizer_opt
* tests
* fix DSP
* clean up
* remove hack for kernelize.py
* hack for test/test_multitensor.py TestMultiTensor.test_matmul_shard_none
* clean
* uop.r need reshape?
* lower_store cause fail
* fix lower?
* avoid contiguous hack
* 2134
* conv2d count
* remove unused
* hack lower
* reduced and clean up
* fix TestMultiTensor.test_matmul_shard_none
* src sync + fix TestMultiTensor.test_matmul_shard_none
* remove excluded in mop
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-08-18 10:09:17 -07:00
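For intuition, the user-facing analogue of the REDUCE_AXIS keepdim change above is ordinary Tensor reduction shapes; the sketch below only shows standard Tensor.sum behavior, nothing specific to this PR's internal IR work:

```python
from tinygrad import Tensor

t = Tensor.ones(4, 8)
print(t.sum(axis=1).shape)                # (4,)  -> reduced axis dropped, i.e. keepdim=False
print(t.sum(axis=1, keepdim=True).shape)  # (4, 1) -> reduced axis kept as size 1
```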
George Hotz
82be8abfd2
move opt under codegen ( #11569 )
2025-08-07 14:19:17 -07:00
George Hotz
21570545d3
move view pushing to codegen, try 2 ( #11534 )
...
* move view pushing to codegen, try 2
* fix up some linearizer tests
* fix test search
* fix test schedule
* delete that test
* fix test arange
* fix a few tests
* update tests
* push views
* ebs cleanup
* fix local/reg
* test and lint
* fix more tests
* test cleanups
* skipped that one
2025-08-06 15:58:38 -07:00
George Hotz
09431d4ad1
make DEFINE_REG behave like the others ( #11273 )
...
* simpler define reg
* cast
* PTRCAT define_acc
* cleanups
* fix uops stats
* fix linearizer tests
* llvm
* define reg sets const
* define reg sets const
* no assign
* collapse that
* fix test_max_pool2d_bigger_stride_dilation
* use index, fix webgpu
* devec
* fix tests
* fix webgpu
* fix llvm
* threads for python
* fix ops_python
* only for reg
* acc_half is real now in the emulator
* fix llvm
* fix webgpu init
* fix wgpu test
* fix some tests
* fix ptx
* fix ptx bool acc
* cleanups
* broken, meh. will fix with ENDRANGE
* line count
2025-07-22 13:53:56 -07:00
chenyu
ec3efd2919
move upcast before reduce ( #11250 )
...
* move upcast before reduce
upcast goes to end of global+local+upcast
* r_196_32_4_24_8
2025-07-18 14:42:15 -04:00
geohotstan
536b254df4
Bump onnx to 1.18.0 ( #11266 )
...
* bump
* thou hast implemented functions
* hacked in domain support
* some clean ups
* hack quantize_onnx_test too
* add helper lol, why onnx tests why
* better dispatcher, but need tests and better naming
* flaky ci
* change some names
* small clean ups
* make it easier to clean up tests once ORT supports 1.18.0
* nits
* fix bug of Softmax_1 being registered in onnx_ops
* need a default value
* resolve_const is better name
* fix OnnxRunner.to
* use proper domain names
2025-07-17 15:35:41 -04:00
chenyu
a0438012af
remove Kernel.get_program [pr] ( #11203 )
2025-07-12 20:50:29 -04:00
geohotstan
5ce278b245
OnnxRunner file as input ( #10789 )
...
* file path as input and have parse be in OnnxRunner.__init__
* modelproto_to_onnxrunner -> modelproto_to_runner
* whoops, fix import
* oh flakiness again, is it because it's getting gc-ed?
* small changes
* CI flaky so just move compile4 fix in
* copy typing of onnx_load
* actually can just import onnx_load instead of onnx.load
* fix external_benchmark_openpilot
* fix onnx_runner test to use onnx_helper
* rerun CI
* try run_modelproto
* spam CI a few times
* revert run_modelproto since that's flaky also
* no external onnx_load usage except onnx.py
* cursor tab complete is evil. Snuck a darn sorted in. But does order change result? Why?
* model_benchmark 193s -> 80s, add OnnxRunner.to()...
* minimize diff and clean up
* device can be None, weird but eh
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-12 14:27:46 -04:00
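A rough usage sketch of what "file as input" means for the runner described above; the import path and the dict-of-inputs call are assumptions pieced together from commit titles in this log (a later commit moves the frontend directory under nn), and "model.onnx" plus the input name are placeholders:

```python
from tinygrad import Tensor
from tinygrad.frontend.onnx import OnnxRunner  # import path is an assumption; a later commit moves frontend/ to nn/

runner = OnnxRunner("model.onnx")                      # per this commit, a file path can be passed directly
runner.to("CPU")                                       # OnnxRunner.to() is added in this PR (device name assumed)
out = runner({"input": Tensor.ones(1, 3, 224, 224)})   # inputs keyed by ONNX input name (assumed)
```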
George Hotz
be53ef4f0a
rename DEFINE_ACC -> DEFINE_REG ( #11006 )
...
* rename DEFINE_ACC -> DEFINE_REG
* add CMPEQ to groupops
2025-06-27 11:09:25 -07:00
George Hotz
92678e59ee
move kernel to opt ( #10899 )
2025-06-20 15:22:28 -07:00
b1tg
24d328e313
onnx parser ( #10435 )
...
* onnx parser
* fix compile, lint
* onnx.load -> onnx_load
* compatible with ModelProto
* fix test external_test_onnx_ops.py
* fix tests
* fix signed int
* reduce to 261 lines
* fix TypeProto.Optional
* debug for _parse_message, add TypeProto.Sequence, cleanup
* onnx_load from Tensor
* remove BufferedReader
* 174 lines and reduce tensor copy
* cleanup
* use onnx_load in external_model_benchmark.py
* fix qcom test
* [onnx] parser support external data
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-09 12:44:28 -04:00
qazal
5b59728c75
refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) ( #10541 )
...
* changes to core tinygrad
* fixups pt1
TC=3
docs/abstractions2.py
IMAGE=2
test_quantize_dsp
test_schedule
* more tests
* green now
* images stay images
2025-05-30 14:27:58 +03:00
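Schematically, the refactor above moves the view from being a sibling source of the LOAD to wrapping the buffer it indexes. The sketch below only constructs the two shapes of UOp tree to show the difference; the constructor arguments, dtypes, and import paths are assumptions (the uop import path follows the later "move files into uop dir" commit), not code from this PR:

```python
from tinygrad.uop.ops import UOp, Ops               # path per the later uop-dir move; an assumption here
from tinygrad.dtype import dtypes
from tinygrad.shape.shapetracker import ShapeTracker

gbuf = UOp(Ops.DEFINE_GLOBAL, dtypes.float.ptr(), (), 0)
st = ShapeTracker.from_shape((4, 4))

# before: LOAD(DEFINE_GLOBAL, VIEW) -- the view rides alongside the buffer
old = UOp(Ops.LOAD, dtypes.float, (gbuf, UOp(Ops.VIEW, dtypes.void, (), st)))
# after: LOAD(VIEW(DEFINE_GLOBAL)) -- the view wraps the buffer being loaded
new = UOp(Ops.LOAD, dtypes.float, (UOp(Ops.VIEW, dtypes.float, (gbuf,), st),))
```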
chenyu
7bfb20757c
fix tensor int floor div ( #10327 )
...
* fix tensor int floor div
* test_float_floordiv_scalar
2025-05-21 06:46:54 -04:00
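The floor-division fix above concerns integer Tensor semantics; a minimal sketch of the behavior in question, assuming the fix makes Tensor `//` agree with Python's floor semantics for negative operands (the Python reference values are plain facts; the Tensor results are the assumed post-fix behavior):

```python
from tinygrad import Tensor

print((Tensor([7, -7]) // 2).tolist())  # expected [3, -4] under floor semantics (rounds toward -inf)
print(7 // 2, -7 // 2)                  # Python reference: 3 -4
```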
George Hotz
411392dfb7
move files into uop dir ( #10399 )
...
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
chenyu
f5256e0020
Kernel.apply_opts [pr] ( #9917 )
...
* Kernel.apply_opts [pr]
updated all `for opt in`. also updated a few test_linearizer tests to not implicitly depend on hand_coded_optimization
* not you yet
2025-04-17 08:00:56 -04:00
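The Kernel.apply_opts change above is a call-site cleanup; a minimal sketch of the before/after pattern the note refers to. The import path is an assumption (kernel/opt files are moved in later commits in this log), and `tune` is a hypothetical helper, not from the PR:

```python
from tinygrad.codegen.opt.kernel import Kernel, Opt  # import path is an assumption for the current tree layout

def tune(k: Kernel, opts: list[Opt]) -> Kernel:
  # before this commit, call sites looped: for opt in opts: k.apply_opt(opt)
  # after it, the loop is a single call:
  k.apply_opts(opts)
  return k
```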
George Hotz
74d98eafb8
add onnx frontend stub [pr] ( #9558 )
2025-03-24 12:24:34 +08:00
George Hotz
8e555c586c
switch quantization to unsigned/unsigned + add Ops.REDUCE ( #9527 )
...
* switch quantization to unsigned/unsigned + add Ops.REDUCE
* tests
* nhwc + replay pkl
2025-03-21 17:02:37 +08:00
George Hotz
cb7a7f69c7
quantization preprocessor from DSP, should be universal ( #9437 )
...
* quantization preprocessor from DSP, should be universal
* touchups
* fix tests
2025-03-15 07:49:37 +08:00
chenyu
01e8b60911
acc_dtype -> dtype ( #9402 )
...
matched numpy and torch
2025-03-10 16:05:30 -04:00
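The rename above aligns the reduce accumulator argument with numpy and torch; a minimal sketch, assuming Tensor reductions now take `dtype=` the way `numpy.ndarray.sum` does (values are illustrative only):

```python
import numpy as np
from tinygrad import Tensor, dtypes

a = Tensor([100, 100, 100], dtype=dtypes.int8)
print(a.sum(dtype=dtypes.int32).item())                               # accumulate in int32; avoids int8 overflow
print(np.array([100, 100, 100], dtype=np.int8).sum(dtype=np.int32))   # the numpy spelling this rename matches
```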
George Hotz
9289425170
add ast to ProgramSpec + pre matcher [pr] ( #9128 )
...
* add ast to ProgramSpec + pre matcher [pr]
* cleaner cast + test fix
2025-02-17 16:39:14 +08:00
George Hotz
4672d9af73
actual tests for the dsp backend [pr] ( #9102 )
...
* actual tests for the dsp backend [pr]
* fix name
2025-02-15 15:17:56 +08:00
George Hotz
0568720a68
delete revectorize ( #9000 )
...
* delete revectorize
* test vectorized LLVM/CLANG
* idk about that
* was that the segfault?
2025-02-10 18:32:35 +08:00
George Hotz
2983285315
use HEX_REG_QEMU_INSN_CNT from qemu as a DSP timer [pr] ( #8993 )
...
* use HEX_REG_QEMU_INSN_CNT from qemu as a DSP timer [pr]
* add quantize test to dsp
* fix tests
* older onnx
* debug, let's see what's happening
2025-02-10 11:07:35 +08:00