wozeparrot
a18963d9e7
feat: use tinygrad useragent (#10488)
2025-05-23 15:44:40 -07:00
qazal
7a762f01ab
s/shape_spec/ast_spec [pr] (#10485)
2025-05-23 15:43:54 +03:00
qazal
127a7c8aee
assert AST views only exist in the edges (#10484)
* assert AST views only exist in the edges
* valid without device
2025-05-23 15:27:09 +03:00
qazal
e491168685
add metadata note + whitespace fixup [pr] (#10483)
* add metadata note + whitespace fixup [pr]
* TestSchedule.test_kernelize_diamond
2025-05-23 14:37:45 +03:00
Sieds Lykles
ce6ebfb8ee
verify rewrites in test_uop_symbolic (#10430)
* verify rewrites in test_uop_symbolic
* use global context
2025-05-23 06:57:29 -04:00
George Hotz
1e4d63e06e
uops can have multiple metadata (#10479)
* uops can have multiple metadata
* fixups
2025-05-22 21:35:02 -07:00
George Hotz
9fc01c1e03
support for uop tags (#10477)
* support for uop tags [pr]
* test uop tags
2025-05-22 19:53:48 -07:00
chenyu
8cc2dff4d8
only float Tensors have gradient [pr] (#10475)
2025-05-22 21:02:11 -04:00
George Hotz
147f7747f2
remove the map from create_schedule_with_vars [pr] (#10472)
2025-05-22 15:58:25 -07:00
George Hotz
0d39bb5de1
rename to get_kernelize_map (#10465)
2025-05-22 11:44:44 -07:00
chenyu
7bfb20757c
fix tensor int floor div (#10327)
* fix tensor int floor div
* test_float_floordiv_scalar
2025-05-21 06:46:54 -04:00
Sieds Lykles
2b4375f36d
Correct divmod folding behind flag (#10433)
* add flag
* add test
* remove import
2025-05-21 06:46:13 -04:00
qazal
df4cbb69e9
move fuzz_schedule.py to extra [pr] (#10444)
2025-05-21 10:07:24 +03:00
chenyu
29624af872
skip commavq in external_model_benchmark (#10439)
precision issue with a different onnxruntime version
2025-05-21 01:45:33 -04:00
George Hotz
03e7a99ca8
add edge cases found by codex [pr] (#10423)
* add edge cases found by codex [pr]
* another test
* more edgecases
* docs
* instructions
* fine, add that one
* nan cases
* roll failures
* inv prob
* more failing tests
* err, that's failing
* more tests
* more failures
* uop verif
* failures
* webgpu
2025-05-20 14:53:18 -07:00
nimlgen
2895198c36
am: download regs (#10419)
* am: download regs
* x
* linter
* mypy
* after merge
* raise
* fixed name
* fix
* xx
* remove
* missing reg
* missing reg
* move to online
* ops
2025-05-20 18:59:56 +03:00
uuuvn
ec9955c956
Use REAL_DEV for test skips (#10420)
This should fix remote CPU test flakiness (the segfaults were in `test_data_parallel_resnet_train_step`, which is skipped on CPU but wasn't skipped on remote CPU)
2025-05-19 17:32:14 -07:00
Sieds Lykles
db09676250
Don't simplify gate in gate; fix FUSE_ARANGE=1 python test/test_ops.py TestOps.test_scatter_add (#10411)
* substitute out index
* Add test
* change comment
2025-05-19 13:16:21 -04:00
qazal
cc8dda1d75
move multi_map to grouper rewrite pass (#10409)
* move multi_map to grouper rewrite pass
* delete that
2025-05-19 10:44:06 +03:00
George Hotz
b06291077c
no amdgpu kernel driver (#10408)
* no amdgpu kernel driver
* don't test hip
* lower req
2025-05-18 20:52:39 -07:00
George Hotz
411392dfb7
move files into uop dir (#10399)
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
uuuvn
27c12be471
amd mockgpu graph support (#10385)
For testing remote graph stuff (prompted by #10371) in CI
2025-05-18 09:43:16 -07:00
qazal
04b23087d8
grouper tests from fuse_arange_default [pr] (#10394)
2025-05-18 18:42:43 +03:00
qazal
9e2089dcd4
don't raise Exception in process replay [pr] (#10392)
* don't raise Exception in process replay [pr]
* continue generating diffs unless [pr] is set, exit(1) otherwise
* change
* works
2025-05-18 11:23:23 +03:00
qazal
0294bfe507
simpler can_pad (#10364)
* simpler can_pad [pr]
* 3 kernels
* tests
* less kernels
2025-05-18 10:00:07 +03:00
George Hotz
6f77b938d7
Move getbits tests into test_helpers (#10382)
2025-05-17 17:04:00 -07:00
George Hotz
6ec88d94df
add tests for multi ram usage [pr] (#10376)
2025-05-17 15:33:40 -07:00
वेदांत
2453d99050
rms matching pytorch implementation (#10319)
* rms matching pytorch implementation
* pre commit fix
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-17 08:23:11 -07:00
qazal
e054b53a75
kernel count tests for pad [pr] (#10369)
* kernel count tests for pads
* handcoded rand one kernel
* comment
* prerealize device rng counter
* test_rand_handcoded generates /0
* remove track_rewrites
2025-05-17 17:20:46 +03:00
George Hotz
e13f2a3092
multi is O(1) (#10183)
* multi is O(1)
* allreduce
* no new uops needed
* junk
* something
* simple
* that's really what i want
* closer
* inject _device_num
* pretty print
* cleanups
* this
* early dnum
* ops allreduce is good
* ish
* device is the tuple and this is fine
* simpler
* progress
* copy_multi
* work
* more tests
* more tests pass
* work
* no None axis
* tests
* no none multi
* type fixes
* pre commit passes
* lil
* remove this
* mlperf dataloader on mac
* that test was wrong
* unbind
* support DEBUG=2
* realize
* only unbind bound vars
* don't include fixedvars
* graph test
* one test
* fixedvars in hcq
* new ring reduce
* ring reduce
* simpler ring
* mselect
* mselect doesn't work
* Revert "mselect doesn't work"
This reverts commit c78b77bd7d.
* Revert "mselect"
This reverts commit bb2e430ac3.
* simpler
* fixups
* no optional
* fix jit
* move things around
* cleanup multi
* simpler multi
* simpler reshape
2025-05-16 23:14:23 -07:00
George Hotz
e1a40e8040
add hcq fixedvars support [pr] (#10356)
* add hcq fixedvars support [pr]
* different test
* fixedvars are only for comp_queues
* fix hcq varvals
2025-05-16 22:05:53 -07:00
George Hotz
876d2275a1
changes from new multi (#10353)
* changes from new multi
* revert hcq change
2025-05-16 13:07:29 -07:00
wozeparrot
66e00c04dd
fix: skip kernel timing tests on ci cuda (#10348)
2025-05-16 11:48:06 -07:00
qazal
e9e5b54e43
grouper cleanups and merge with insert_kernels [pr] (#10349)
* grouper cleanups and merge with insert_kernels [pr]
* remove that
2025-05-16 14:39:56 +03:00
b1tg
caded2f413
llvm diagnostic error (#10267)
* llvm diagnostic info
* use decorator
* better error reporting
* fix mypy
* collect all diag msgs
* test diag error
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-16 02:03:20 -04:00
George Hotz
a4a25720b2
add test_multitensor_jit_input [pr] (#10347)
2025-05-15 20:47:57 -07:00
wozeparrot
1ed04f993b
move benchmark stat tracking to influxdb (#10185)
2025-05-15 16:14:56 -07:00
wozeparrot
f59ecf2116
fix: mockgpu cuda timing (#10343)
2025-05-15 14:14:14 -07:00
qazal
7cfe367c07
failing test for slow embedding kernel with FUSE_ARANGE=1 [pr] (#10330)
2025-05-15 14:58:11 +03:00
qazal
0a45cd0cbe
grouper: merge views in fuse elementwise (#10325)
* grouper: merge views in fuse elementwise
* with gradient api
2025-05-15 13:17:09 +03:00
qazal
89d8d5b25e
add dims check in FUSE_ARANGE (#10323)
2025-05-15 11:33:21 +03:00
qazal
8fad0f0124
grouper: check for unsafe PAD in FUSE (#10322)
2025-05-15 10:53:44 +03:00
chenyu
f008e5f233
test_dtype_alu should cast bf16 input (#10320)
when testing ALU ops for bfloat16, the inputs should be cast to bfloat16 first; otherwise the numpy reference carries both input-quantization error and ALU error, making the comparison less accurate
2025-05-15 01:11:39 -04:00
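The rationale in that commit message can be sketched with plain numpy. Stock numpy has no native bfloat16 dtype, so this illustration uses float16 instead (an assumption; the mechanism is the same): a reference computed from the raw input mixes input-quantization error into the comparison, while casting first isolates the ALU's own rounding error.

```python
import numpy as np

x = np.float32(0.1)    # test input; not exactly representable in half precision
x16 = np.float16(x)    # what a half-precision backend actually receives

# emulated device ALU op, performed in half precision
dev = x16 * x16

# reference from the raw input: includes the input-quantization error
ref_raw = np.float16(x * x)
# reference from the cast input: isolates the op's own rounding error
ref_cast = np.float16(np.float32(x16) * np.float32(x16))

# dev matches ref_cast exactly, but differs from ref_raw
print(dev == ref_cast, dev == ref_raw)
```

Here the device result agrees with the cast-first reference but not with the raw-input reference, which is exactly the spurious mismatch the commit removes from test_dtype_alu.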
George Hotz
568d6d96e7
small changes from new multi [pr] (#10318)
2025-05-14 20:50:59 -07:00
chenyu
f6cf25fce4
cleanup test_conv2d_ceildiv_edge_case [pr] (#10317)
2025-05-14 23:35:28 -04:00
Kirill R.
50d7162acd
Add conv2d ceildiv edge case (#10303)
2025-05-14 22:50:23 -04:00
wozeparrot
9bbc2bc2a7
hotfix: filter_too_much (#10308)
2025-05-14 15:31:51 -07:00
George Hotz
42e70193c9
multi: instead of real, just copy (#10289)
* multi: instead of real, just copy
* fix test
* remove real
2025-05-14 10:36:55 -07:00
qazal
043efc6ec4
do not require self for track_rewrites [pr] (#10302)
2025-05-14 18:23:32 +03:00
qazal
d342f7688d
remove some skips in test_schedule + use assertRaisesRegex [pr] (#10296)
2025-05-14 14:54:07 +03:00