Commit Graph

4433 Commits

Author SHA1 Message Date
qazal
84d568d0cc do not import grouper internals in test_schedule [pr] (#10861)
* fix import

* fix test_multitoutput_ast

* fix test_recursive_swizzle

* test_alu_after_copy

* remove that test
2025-06-18 12:47:57 +03:00
qazal
8b879b0314 merge TestTensorUOpSpec with the other spec unittests [pr] (#10860)
* merge TestTensorUOpSpec with the other spec unittests [pr]

* rename to test_uop_spec
2025-06-18 12:12:08 +03:00
George Hotz
cba6e15937 split grouper and kernelize [pr] (#10854) 2025-06-17 17:54:20 -07:00
George Hotz
75503955bf simple schedule test [pr] (#10853) 2025-06-17 16:19:27 -07:00
uuuvn
a51f18f8f9 CI flakiness (#10851)
https://github.com/tinygrad/tinygrad/actions/runs/15718103629/job/44292845140?pr=10753#step:4:161
2025-06-17 14:46:30 -07:00
qazal
e77cd81662 time viz (#10763)
* work

* basic stuff

* work

* also reset

* moving through time

* cleanup

* proper zoom

* add livereload.js

pip install livereload
livereload tinygrad/viz

* minor

* fixed width, remove viewbox

* bit of flexbox magic

* show pid/tid

* merge loops

* min-height

* redo some layout stuff

* create cell groups

* text is hard

* javascript Math.min causes "Maximum call stack size"
bert repro: VIZ=1 PYTHONPATH=. DEFAULT_FLOAT=HALF BS=66 GPUS=6 BERT_LAYERS=2 FUSE_ARANGE=1 MODEL=bert python3 examples/mlperf/model_train.py

* fix recursion issue

* no viz/server changes

* fix test_viz

* everything is a g

* text is easy

* no it's still hard

* livereload+notes

* height: 100% fixes the device bug

* start canvas work

* base canvas

* take chrome's stuff

* serve chrome's thing

* fetch traces from get_profile

* remove junk

* remove some more

* bring everything back again

* dispatch resize events

* base ticks

* hook d3.zoom

* zoom on the x axis

* bring filter back, makes ctrl+drag possible

* remove junk

* Revert "remove junk"

This reverts commit 4987e7bec1.

* draws something, the zooms aren't right

* move to canvas

* fix zooming

* Revert "Revert "remove junk""

This reverts commit 5aac2034fb.

* space key resets zoom

* Divide timelines by device on y axis

* Show kernel names when the width allows

* Clicking on kernel opens it in the kernel graph

* remove livereload.js

* reset diff

* base diff:

- fetch traceEvents
- displayGraph
- flexbox layout
- rest of canvas

* rescale in-place is faster, d3's rescaleX creates a copy

* less

* aesthetics

* map names

* first viz is profiler

* this will work when i make canvas once

* initial cleanups

* factor out of loop

* refactor + only show devices with events

* properly align program rects

* cleaner tick lines

* padding

* listen for resize

* simple zoom

* space more

* i always end up making zoom global

* how is this ever allowed

* clicking works again

* back button goes back to the same zoom level

* work

* more work

* bring that back

* coloring work and simplify

* black

* keep perfetto button for comparison

* better

* ph===X

* simplify history stuff

* temp: handcoded

* test: flamegraph style leveling

* Revert "temp: handcoded"

This reverts commit bdcd538e88.

* disable flamegraph

* group by pid

* factor y and height out of render

* now flamegraph is easy

* livereload stuff

* remove that

* less
2025-06-17 19:39:34 +03:00
George Hotz
531d143780 bring back old sharded rand behavior (#10842) 2025-06-16 17:23:47 -07:00
George Hotz
a493eb396c fix view add 0 (#10840) 2025-06-16 16:46:12 -07:00
George Hotz
e2907360b7 multi is one PM [pr] (#10838)
* multi is one PM [pr]

* disable flaky tests
2025-06-16 14:52:47 -07:00
Sieds Lykles
b1fefb76dd More conditions for (x//c1+a)//c2 -> (x+a*c1)//(c1*c2) (#10834)
* add rule and test

* typo

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-16 16:34:52 -04:00
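
The rewrite named in b1fefb76dd, `(x//c1+a)//c2 -> (x+a*c1)//(c1*c2)`, can be sanity-checked by brute force. The sketch below is not tinygrad code: `tdiv`, `nested`, `flattened` and `rule_holds` are hypothetical helpers, and the guess that the rule needs conditions because integer division truncates toward zero is an assumption, not something this log states.

```python
from itertools import product

def floor_div(x: int, c: int) -> int:
    """Python's native floor division."""
    return x // c

def tdiv(x: int, c: int) -> int:
    """Integer division that truncates toward zero (C-style)."""
    q = abs(x) // abs(c)
    return q if (x < 0) == (c < 0) else -q

def nested(x, c1, c2, a, div):
    # Left-hand side of the rule: (x//c1 + a)//c2
    return div(div(x, c1) + a, c2)

def flattened(x, c1, c2, a, div):
    # Right-hand side of the rule: (x + a*c1)//(c1*c2)
    return div(x + a * c1, c1 * c2)

def rule_holds(div, xs, c1s, c2s, a_values) -> bool:
    """True if both sides agree for every combination of inputs."""
    return all(
        nested(x, c1, c2, a, div) == flattened(x, c1, c2, a, div)
        for x, c1, c2, a in product(xs, c1s, c2s, a_values)
    )

# Floor division with positive constants: holds for negative x and a too.
FLOOR_OK = rule_holds(floor_div, range(-40, 40), range(1, 6), range(1, 6), range(-4, 5))

# Truncating division with non-negative x and a: also holds.
TRUNC_NONNEG_OK = rule_holds(tdiv, range(0, 40), range(1, 6), range(1, 6), range(0, 5))

# Truncating division with a negative x: the two sides can differ.
TRUNC_COUNTEREXAMPLE = (nested(-1, 2, 2, 2, tdiv), flattened(-1, 2, 2, 2, tdiv))
```

Under truncation, `x=-1, c1=2, a=2, c2=2` gives different values on the two sides, which is the kind of case a sign condition on the rule would need to exclude.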
uuuvn
18d936f981 Remote multihost (#10598) 2025-06-16 13:18:56 -07:00
Sieds Lykles
deb6af0638 Remove incorrect rule for x%-d -> (x%d)*-1 (#10832)
* fix rule and add test

* combine tests
2025-06-16 11:37:44 -04:00
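
A small check of why the rule removed in deb6af0638, `x%-d -> (x%d)*-1`, is incorrect. This is a sketch under the assumption that the remainder follows C-style truncating semantics (its sign follows the dividend); `trunc_rem` is a hypothetical helper, not a tinygrad function.

```python
def trunc_rem(x: int, d: int) -> int:
    """Remainder of truncating (C-style) division; the sign follows x."""
    q = abs(x) // abs(d)
    if (x < 0) != (d < 0):
        q = -q
    return x - d * q

# Negating the divisor leaves the truncating remainder unchanged...
SAME_AS_POSITIVE = all(
    trunc_rem(x, -d) == trunc_rem(x, d)
    for x in range(-20, 21) for d in range(1, 8)
)

# ...so negating the result is wrong whenever the remainder is nonzero.
REMOVED_RULE_LHS = trunc_rem(7, -3)      # x % -d
REMOVED_RULE_RHS = trunc_rem(7, 3) * -1  # (x % d) * -1

# The removed rule is also wrong under Python's floor-based % operator.
PY_LHS, PY_RHS = 7 % -3, (7 % 3) * -1
```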
chenyu
e5d5ae55f9 smaller inputs for test_sort and test_topk (#10829) 2025-06-16 00:21:15 -04:00
nimlgen
c0329148c7 am: check va is aligned to page size (#10815)
* am: check va is aligned to page size

* swap them

* is this faster
2025-06-15 22:51:09 +03:00
Sieds Lykles
ac27c46104 fix UPat get_location after mathtraits refactor (#10814)
* fix UPat get_location

* fold line
2025-06-15 12:47:55 -07:00
George Hotz
5dc1bc6070 switch get_kernel -> get_program [pr] (#10817)
* switch get_kernel -> get_program [pr]

* fix tests
2025-06-15 12:26:50 -07:00
Sieds Lykles
37d3ca152e Adapt >> for division by power of two to all ints (#10803)
* Change division by power of two to always use shift

* Change test to test int instead of uint

* simplify condition

* add old rule back with comment

* remove import

* use sresolve instead of simplify

* use keyword in simplify instead of sresolve

* webgpu cast y to uint

* remove comment

* explicitly set dtype in wgsl

* without simplify

* undo simplify kwarg

* change test to test both int32 and uint32
2025-06-14 14:55:51 -04:00
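
The relationship between `>>` and division by a power of two that 37d3ca152e relies on can be illustrated in plain Python. This is a sketch, not the tinygrad rule itself: `tdiv` is a hypothetical helper for truncating division, and the idea that a signed rewrite needs the dividend to be non-negative is an assumption made for the example.

```python
def tdiv(x: int, c: int) -> int:
    """Integer division that truncates toward zero (C-style)."""
    q = abs(x) // abs(c)
    return q if (x < 0) == (c < 0) else -q

# An arithmetic right shift by k is floor division by 2**k for every int.
SHIFT_IS_FLOOR = all(
    (x >> k) == x // (1 << k)
    for x in range(-64, 65) for k in range(0, 6)
)

# Against truncating division, the shift matches for non-negative x...
SHIFT_IS_TRUNC_NONNEG = all(
    (x >> k) == tdiv(x, 1 << k)
    for x in range(0, 65) for k in range(0, 6)
)

# ...but rounds the other way for negative x that is not a multiple of 2**k.
NEGATIVE_CASE = (-7 >> 1, tdiv(-7, 2))
```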
chenyu
652db5702b move test_conv_shapetracker and some test_search util into unit test (#10812) 2025-06-14 13:29:32 -04:00
leopf
118a09ddcf xor self folding (#10806)
* xor folding

* tests + z3 bitwise xor
2025-06-14 10:01:17 -04:00
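
The identity behind "xor self folding" in 118a09ddcf is `x ^ x == 0`. A minimal sketch of such a fold over a toy expression tree follows; the tuple representation and `fold_xor` are hypothetical and unrelated to how tinygrad represents UOps.

```python
def fold_xor(expr):
    """Fold ("xor", e, e) to 0, recursing into sub-expressions first.

    An expression is an int constant, a variable name (str),
    or a tuple ("xor", lhs, rhs).
    """
    if isinstance(expr, tuple) and expr[0] == "xor":
        lhs, rhs = fold_xor(expr[1]), fold_xor(expr[2])
        if lhs == rhs:  # x ^ x is always 0
            return 0
        return ("xor", lhs, rhs)
    return expr

# The underlying identity holds for every integer.
IDENTITY_OK = all((x ^ x) == 0 for x in range(-8, 9))

FOLDED_SIMPLE = fold_xor(("xor", "x", "x"))
FOLDED_NESTED = fold_xor(("xor", ("xor", "a", "a"), "b"))
UNTOUCHED = fold_xor(("xor", "a", "b"))
```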
chenyu
8c28b5d833 move dtype spec tests into unit test (#10808)
* move dtype spec tests into unit test

can clean up more after the split

* skip CI test_backward_sum_acc_dtype
2025-06-13 22:21:22 -04:00
chenyu
7a6df0a161 remove .relu() call in several conv tests in test_ops (#10807)
* remove .relu() call in several conv tests in test_ops

testing the negative parts doubles the effectiveness. keep the relu between two convs and in the tests that explicitly test relu

* relax tol
2025-06-13 17:10:16 -04:00
qazal
a113c5e3ae viz: update browser test to properly shutdown [pr] (#10793)
Using `await page.evaluate` can cause nondeterministic `TargetCloseError`
exceptions if it cannot find the elements on the page; Puppeteer
doesn't cleanly stop when `browser.close()` is called.
[Failing CI](https://github.com/tinygrad/tinygrad/actions/runs/15596803685/job/43928961323?pr=10763#step:9:61)
2025-06-12 17:58:42 +03:00
wozeparrot
eb739bb96a hotfix: lower threshold (#10786) 2025-06-11 19:36:20 -04:00
Sieds Lykles
10b61157b9 Support symbolic slice with no start [pr] (#10775)
* add symbolic slice with no start

* reshape the test

* step must be int

* just add a cast...

* more cast...
2025-06-11 16:00:38 -04:00
George Hotz
a38947b4bb move symbolic and transcendental to uop [pr] (#10771) 2025-06-10 20:51:22 -07:00
chenyu
612cdf5146 move fuzz_shape_ops to run with other fuzzer (#10767)
* move fuzz_shape_ops to run with other fuzzer

* don't skip CPU
2025-06-10 17:43:04 -04:00
b1tg
52c49dd4f3 fix onnx ci (#10762)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-06-10 14:28:40 -04:00
chenyu
14fa62c61d move high level tests to unit (#10760)
these either need no backend, or running them on one backend is sufficient
2025-06-10 12:55:44 -04:00
Sieds Lykles
0daa4c6ed0 Add DType.min and DType.max properties (#10749)
* add properties

* cleaner test

* remove added newline
2025-06-10 08:31:34 -07:00
qazal
5d9c274924 keep UOp tags if sources are replaced (#10754)
* keep UOp tags in unified_rewrite

* add failing test, print tag if defined

* remove the repr change
2025-06-10 08:30:14 -07:00
George Hotz
acf72872b3 move view left to the outer graph prereqs + testing (#10725)
* move view left to the outer graph

* global view right

* dont need that one

* remove comment

* test kernelize

* simple

* split onnx, test sdxl null

* fix testing

* ugh, wrong one

* Update test.yml
2025-06-09 20:43:25 -07:00
chenyu
b7198fdcfd linearizer failure from wino fuse arange cifar (#10739) 2025-06-09 23:10:19 -04:00
George Hotz
81ef879da3 non recursive top_down_rewrite (#10729)
* non recursive top_down_rewrite

* nicer algorithm

* rewrite bottom up also

* only top down is broken?

* simpler iterative algo

* no recursion errors

* top down and bottom up

* unified rewrite

* simpler rewrite

* clean up comments

* move that comment
2025-06-09 16:33:04 -07:00
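
The bullets in 81ef879da3 describe replacing a recursive graph rewrite with an iterative one to avoid recursion errors. The sketch below shows one generic way to do a bottom-up rewrite with an explicit stack. It is an illustration only: `Node`, `rewrite_bottom_up` and `drop_add_zero` are hypothetical names, the rule is applied once per node, and none of this is claimed to match tinygrad's actual algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True, eq=False)  # eq=False keeps identity-based hashing
class Node:
    op: str
    src: tuple = ()
    arg: object = None

Rule = Callable[[Node], Optional[Node]]

def rewrite_bottom_up(root: Node, rule: Rule) -> Node:
    """Rewrite children before parents using an explicit stack, no recursion."""
    done = {}                # original node -> rewritten node
    stack = [(root, False)]  # (node, are its children already processed?)
    while stack:
        node, children_ready = stack.pop()
        if node in done:
            continue
        if not children_ready:
            # Revisit this node after all of its sources have been rewritten.
            stack.append((node, True))
            stack.extend((s, False) for s in node.src if s not in done)
            continue
        new_src = tuple(done[s] for s in node.src)
        unchanged = all(a is b for a, b in zip(new_src, node.src))
        rebuilt = node if unchanged else Node(node.op, new_src, node.arg)
        out = rule(rebuilt)  # the rule returns None when it does not match
        done[node] = rebuilt if out is None else out
    return done[root]

def drop_add_zero(n: Node) -> Optional[Node]:
    """Example rule: add(x, const 0) -> x."""
    if n.op == "add" and n.src[1].op == "const" and n.src[1].arg == 0:
        return n.src[0]
    return None

# A chain deeper than Python's default recursion limit of 1000.
x, zero = Node("var", arg="x"), Node("const", arg=0)
deep = x
for _ in range(5000):
    deep = Node("add", (deep, zero))

RESULT = rewrite_bottom_up(deep, drop_add_zero)
```

Because pending work lives in a list rather than on the call stack, the depth of the graph is limited by memory and not by the interpreter's recursion limit.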
chenyu
53cbd4254b suppress filter_too_much on test_float_cast_to_unsigned (#10733)
flaky, already done in test_float_cast_to_unsigned_overflow and test_float_cast_to_unsigned_underflow
2025-06-09 18:30:04 -04:00
chenyu
55cdbb9a20 fix mask in expand into symbolic size (#10730)
failed before when old size is 1 and it expands into symbolic size, because `resolve(s != ns, False)` is False and it does not expand the mask
2025-06-09 17:33:22 -04:00
wozeparrot
926b11381c failing test for symbolic expand after pad (#10727)
* feat: failing test for symbolic expand after pad

* feat: mark test as failing
2025-06-09 16:55:21 -04:00
chenyu
49f999d919 update _reshape_mask for symbolic shape expand (#10726)
* don't merge shape symbolic reshape symbolic

* proper fix
2025-06-09 16:35:02 -04:00
wozeparrot
27dd97f688 support variable shape none slice in getitem (#10724) 2025-06-09 11:53:02 -07:00
George Hotz
f84c320548 better external_benchmark_schedule [pr] (#10722) 2025-06-09 10:26:11 -07:00
b1tg
24d328e313 onnx parser (#10435)
* onnx parser

* fix compile, lint

* onnx.load -> onnx_load

* compatible with ModelProto

* fix test external_test_onnx_ops.py

* fix tests

* fix signed int

* reduce to 261 lines

* fix TypeProto.Optional

* debug for _parse_message, add TypeProto.Sequence, cleanup

* onnx_load from Tensor

* remove BufferedReader

* 174 lines and reduce tensor copy

* cleanup

* use onnx_load in external_model_benchmark.py

* fix qcom test

* [onnx] parser support external data

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-09 12:44:28 -04:00
George Hotz
81b9c04574 move high level stuff to unit tests [pr] (#10708)
* move high level stuff to unit tests [pr]

* process replay on unit tests

* fix pr, less compute

* set omp num threads

* set 200MB buffer size limit

* delete junk

* fix tests

* faster

* move test_indexing to unit

* faster
2025-06-08 14:05:56 -07:00
George Hotz
4e2c3560b4 smaller tests are faster tests [pr] (#10704)
* remove del spam from CI

* more

* preconstruct default buffer spec

* ignore those errors

* check exception

* more exception check

* skip stuff

* smaller tests mean faster tests

* a few more
2025-06-08 10:54:19 -07:00
George Hotz
32e9949052 rename lazydata to uop (#10698) 2025-06-08 08:42:22 -07:00
uuuvn
8e3f337075 Skip flaky test in ci (#10696)
`test_data_parallel_resnet_train_step` is already skipped on LLVM/CPU:

```python
@unittest.skipIf(CI and REAL_DEV in ("CUDA", "NV", "LLVM", "CPU"), "slow, and flaky on LLVM/CPU")
@unittest.skipIf(REAL_DEV == "WEBGPU" and not OSX, "WEBGPU Vulkan can only run kernels with up to 10 buffers")
def test_data_parallel_resnet_train_step(self):
```

It looks like `test_data_parallel_resnet` (no `_train_step`) is flaky in a similar way:
https://github.com/tinygrad/tinygrad/actions/runs/15472667248/job/43560773882?pr=10642#step:9:64
2025-06-08 08:24:09 -07:00
George Hotz
8c76250d31 speed up a few tests (#10692) 2025-06-07 20:39:25 -07:00
ihar
40c1479267 added unit tests for 'argfix' (#10678) 2025-06-07 22:17:10 -04:00
ihar
74b849b5e1 remove unnecessary 'argfix' because 'view' is an alias to 'reshape'. all functionality must be inside 'reshape' (#10677)
* remove unnecessary 'argfix' because 'view' is an alias to 'reshape'. all functionality must be inside 'reshape'

* added the same set of unit tests for 'view' as for 'reshape' since 'view' is just an alias for 'reshape'

* improved tests for 'view' op
2025-06-07 22:15:31 -04:00
Sieds Lykles
c29a56dd51 Fix whisper OOB (#10685)
* fix whisper and test

* remove import
2025-06-07 20:23:50 -04:00
George Hotz
53ed64e133 ci speed work 1 (#10676)
* skip a few slow tests

* use a venv for python packages

* create venv

* no user, it's in venv

* ignore venv

* venv

* new cache key

* try that

* this

* version the python cache
2025-06-07 16:33:11 -07:00
qazal
cb61774ab6 move shared viz fields out of serve.py [pr] (#10684)
* move shared viz fields out [pr]

* update javascript

* update test_viz
2025-06-07 17:18:18 +03:00