nimlgen
0f374e10d2
cpu: use mmap for allocations ( #11349 )
...
* cpu: use mmap for allocations
* ops
* fix mypy
2025-07-23 20:30:18 +03:00
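Note on "use mmap for allocations": the log does not show how the CPU backend calls mmap. As a minimal sketch only, this is what an anonymous, zero-filled mapping looks like through Python's standard `mmap` module. The helper name `alloc_buffer` is hypothetical and is not taken from the commit.

```python
import mmap

def alloc_buffer(size: int) -> mmap.mmap:
    # Hypothetical helper: fileno=-1 requests an anonymous mapping that is
    # not backed by any file; the memory starts zero-filled.
    return mmap.mmap(-1, size)

buf = alloc_buffer(4096)
buf[:5] = b"hello"   # the mapping is writable like a bytearray
first = bytes(buf[:5])
buf.close()          # releases the mapping
```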
George Hotz
ae07a93814
simple block barrier ( #11341 )
...
* simple block barrier
* simple
2025-07-23 10:14:11 -07:00
chenyu
86e7504111
mypy check extra/onnx.py ( #11348 )
...
Instead of running tests with 3.10, add onnx to mypy, which would have caught the StrEnum regression. Several type annotations currently fail mypy but do not affect running the code; they are skipped for now.
2025-07-23 12:42:59 -04:00
chenyu
960da9319d
Remove StrEnum in onnx for python 3.10 ( #11345 )
...
some training tests failed; looks like a parsing error?
2025-07-23 11:52:25 -04:00
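Note on the StrEnum removal: `enum.StrEnum` was added in Python 3.11, so importing it fails on 3.10. The log does not say what replaced it. One common 3.10-compatible pattern, shown here as an assumption and not as the actual change, is to mix `str` into a plain `Enum`. The class and member names are hypothetical.

```python
from enum import Enum

class Domain(str, Enum):
    # Hypothetical enum: mixing in str makes members compare equal to
    # plain strings, which works on Python 3.10 without StrEnum.
    ONNX = "ai.onnx"
    ML = "ai.onnx.ml"

def parse_domain(name: str) -> Domain:
    # Lookup by value; raises ValueError for unknown names.
    return Domain(name)
```

Members still behave as strings in comparisons (`Domain.ONNX == "ai.onnx"`), but `str()` and `format()` output differs between this mixin and `StrEnum`, so code that prints members should use `.value`.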
qazal
478a355325
gate PRINT_MATCH_STATS behind graph_rewrite tracking ( #11344 )
2025-07-23 16:32:43 +03:00
nimlgen
ca09c180dc
cpu: remove del spam ( #11343 )
...
* cpu: remove del spam
* fix
2025-07-23 12:02:37 +03:00
nimlgen
304eb9cecb
allocate less memory in am tests ( #11342 )
2025-07-23 11:11:26 +03:00
George Hotz
e14b4fefa5
ranges on store ( #11334 )
...
* ranges on store
* fix store spec
* fix that
* fix gates
* fix tests
* fix ptx
2025-07-22 21:00:50 -07:00
George Hotz
c65b5aab62
small things from endrange ( #11339 )
...
* small things from endrange
* store
2025-07-22 19:45:37 -07:00
George Hotz
53339e62f7
no gate store anymore ( #11338 )
...
* no gate store anymore
* fix up spec
2025-07-22 18:41:15 -07:00
chenyu
7a9a5cfd28
isolate test/external/external_test_am.py ( #11335 )
...
seems to be the one crashing, also remove -n=auto for that
2025-07-22 19:02:20 -04:00
George Hotz
fcbd0e4de3
assigns are no longer used [pr] ( #11333 )
2025-07-22 15:35:07 -07:00
George Hotz
09431d4ad1
make DEFINE_REG behave like the others ( #11273 )
...
* simpler define reg
* cast
* PTRCAT define_acc
* cleanups
* fix uops stats
* fix linearizer tests
* llvm
* define reg sets const
* define reg sets const
* no assign
* collapse that
* fix test_max_pool2d_bigger_stride_dilation
* use index, fix webgpu
* devec
* fix tests
* fix webgpu
* fix llvm
* threads for python
* fix ops_python
* only for reg
* acc_half is real now in the emulator
* fix llvm
* fix webgpu init
* fix wgpu test
* fix some tests
* fix ptx
* fix ptx bool acc
* cleanups
* broken, meh. will fix with ENDRANGE
* line count
2025-07-22 13:53:56 -07:00
chenyu
4535908679
update keccak test_long ( #11331 )
...
it should compare with arg "shake_128"
2025-07-22 16:08:01 -04:00
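Note on the "shake_128" comparison: a reference digest for such a test can be produced with the standard library's `hashlib.shake_128`. This is a sketch of the reference side only. The function name is hypothetical, and the tinygrad call under test is not shown because the log does not include it.

```python
import hashlib

def shake128_reference(data: bytes, out_len: int) -> bytes:
    # SHAKE128 is an extendable-output function, so the caller picks
    # the digest length in bytes.
    return hashlib.shake_128(data).digest(out_len)

# A long input, as a "test_long"-style case might use.
long_digest = shake128_reference(b"a" * 100_000, 16)
```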
nimlgen
3faa352dcc
am: bump version after mm changes ( #11328 )
2025-07-22 21:54:10 +03:00
George Hotz
affd83961c
small changes from define_reg ( #11327 )
...
* small changes from define_reg
* fix webgpu
2025-07-22 11:11:48 -07:00
nimlgen
53b3d87456
am: use 4-lvl pdir ( #11326 )
2025-07-22 20:58:15 +03:00
chenyu
2d7c28de6a
clean up dup lambdas in helper_test_exception ( #11325 )
2025-07-22 12:21:57 -04:00
chenyu
c6aa8e58ca
fix TestDropoutProbabilityEdgeCases ( #11322 )
2025-07-22 11:13:56 -04:00
chenyu
fb42c84365
merge TestRollEdgeCases into test_ops ( #11321 )
2025-07-22 10:55:57 -04:00
chenyu
1d8b3e9d1c
movementop only Tensor.roll ( #11317 )
...
* movementop only Tensor.roll
* fixed
2025-07-22 10:34:15 -04:00
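Note on "movementop only Tensor.roll": the log does not show the implementation. The idea of a roll built purely from data movement (slice and concatenate, no arithmetic on the elements) can be sketched on a plain Python list. The function below is illustrative and is not the tinygrad code.

```python
def roll(values: list, shift: int) -> list:
    # Cyclic shift using only slicing and concatenation.
    n = len(values)
    if n == 0:
        return []
    s = shift % n  # normalizes negative and oversized shifts
    return values[n - s:] + values[:n - s]
```

A positive shift moves elements toward higher indices, so `roll([1, 2, 3, 4], 1)` gives `[4, 1, 2, 3]`.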
chenyu
a41140241b
truncate unsigned const in cstyle ( #11318 )
...
it can be a warning or a hard error in clang
PTX and PYTHON also need fix, skipping for now
2025-07-22 08:02:12 -04:00
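Note on truncating unsigned constants: a constant outside the range of its unsigned type can trigger a clang diagnostic when emitted as a C literal. A minimal sketch of the idea, assuming a simple mask to the type's bit width; the function names and the `u` suffix rendering are illustrative, not the cstyle renderer's actual code.

```python
def truncate_unsigned(value: int, bits: int) -> int:
    # Keep only the low `bits` bits, i.e. reduce modulo 2**bits.
    return value & ((1 << bits) - 1)

def render_unsigned_const(value: int, bits: int = 32) -> str:
    # Hypothetical rendering: emit the in-range value with a C unsigned suffix.
    return f"{truncate_unsigned(value, bits)}u"
```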
qazal
6668d6d241
fix word_wrap with newlines in input string [pr] ( #11319 )
2025-07-22 12:03:13 +03:00
qazal
0c4e19f270
hotfix: disable process replay in REMOTE=1 tests ( #11320 )
...
* hotfix: disable process replay in REMOTE=1 tests
* comment
2025-07-22 10:41:58 +03:00
George Hotz
3b674df34b
generic changes from define_reg_2 ( #11315 )
...
* generic changes from define_reg_2
* fix for ptx
* ugh, that one
2025-07-21 15:14:06 -07:00
chenyu
6e9506e6fd
Tensor.roll supports dims=None ( #11313 )
2025-07-21 17:29:23 -04:00
George Hotz
108aac8af4
use AddrSpace instead of local ( #11314 )
...
* use AddrSpace instead of local
* addrspace in test
2025-07-21 14:00:06 -07:00
chenyu
d3a93185a6
clean up test_roll ( #11312 )
2025-07-21 16:00:50 -04:00
George Hotz
532b52fcef
store has a dtype, like assign ( #11309 )
...
* store has a dtype, like assign
* fix upat
* fix test
2025-07-21 12:50:01 -07:00
geohotstan
445ff8de56
ONNX onnx_parser and buffer_parse clean up ( #11000 )
...
* start
* remove onnx.load from compile4 and move np to dropout
* clean up and enable test
* clean up
* move WebGPU ONNX test into MacOS (WebGPU)
* leave test in ONNX (CPU)
* fix raw_data init None, and simplify onnx_runner test a little?
* THESE TESTS ARE SO UGLY UGHH
* need to really think about how to structure the test
* wow LLMs are quite something
* not always on disk now
* also add external data loading test
* cleaner tests
* minimize diff and add const folding tests
* add external data loading too
* whoops add webgpu back.. but why was it not needed in the first place?
* better comment
* move webgpu test to macos(webgpu)?
* llm english so much better than me wow
* trigger CI to check flakiness
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-21 15:10:25 -04:00
George Hotz
842184a1ab
rename kernelize to schedule, try 2 ( #11305 )
2025-07-21 11:18:36 -07:00
George Hotz
7e8f5dde74
matmul style is still reshape ( #11308 )
2025-07-21 11:14:57 -07:00
George Hotz
41de76a7fd
put assign and store next to each other [pr] ( #11306 )
2025-07-21 11:07:35 -07:00
nimlgen
de2df92551
hcq: use devices instead of ids in HCQGraph ( #11303 )
...
* hcq: use devices instead of ids in HCQGraph
* fiz
2025-07-21 20:03:12 +03:00
wozeparrot
30ce16a424
feat: failing test for long keccak ( #11292 )
2025-07-21 12:49:23 -04:00
uuuvn
178dbf3f66
Remote scheduler changes ( #11177 )
2025-07-21 09:29:44 -07:00
वेदांत
e368628736
Add amin support to Tensor operations in Torch backend ( #11290 )
...
* intiger div mod fix
* Revert "intiger div mod fix"
This reverts commit d5d2f201bf.
* feat arg_min support
* tets update
* test fix
2025-07-21 09:14:08 -04:00
qazal
5eb54e2499
viz: close event streams before profiler render ( #11300 )
2025-07-21 15:42:31 +03:00
nimlgen
cc3c1e4c14
hcq: move cpu to hcq ( #11262 )
...
* hcq: move cpu to hcq
* import time
* upd
* fix
* windows support
* hm
* cleaner
* fix timer
* fix timing
* std is ns
* skip profiler
* mypy
* cleaner
* cleanups
* after merge
* default is back
2025-07-21 15:10:38 +03:00
nimlgen
816c01c2d4
hcq: default copy_queue_t=None ( #11297 )
2025-07-21 14:45:20 +03:00
qazal
6520a7fcb6
viz: factorize event stream ( #11298 )
2025-07-21 14:42:00 +03:00
nimlgen
9c533e5c38
hcq: cpu prereq ( #11296 )
2025-07-21 13:35:18 +03:00
nimlgen
e87a42e243
hcq: prepare for windows ( #11293 )
...
* hcq: prepare for windows
* comments
2025-07-21 13:08:56 +03:00
nimlgen
df3ba0a7c0
autogen: fix imports in libusb ( #11294 )
2025-07-21 13:04:27 +03:00
nimlgen
dd6a2d432f
hcq: default timestamp metrics is ns ( #11295 )
2025-07-21 12:56:30 +03:00
wozeparrot
53345ef4e2
feat: make ops_disk work on block devices ( #11291 )
2025-07-20 14:39:50 -07:00
qazal
3002c63b1e
process replay: optionally pass tinygrad import error ( #11289 )
...
* process replay: optionally pass tinygrad import error
* gate all tinygrad internals
* s/getenv/os.getenv pre import
* diff
2025-07-20 22:57:56 +03:00
chenyu
9e3a593313
minor kernel.py cleanups [pr] ( #11286 )
2025-07-20 10:15:31 -04:00
quortus
5f17927a87
Shorten UOp.load method ( #11285 )
2025-07-20 13:48:04 +03:00
chenyu
54924f9969
type remove Union and Optional [pr] ( #11283 )
...
use `|` for consistency
2025-07-19 14:05:52 -04:00