2848 Commits

Author SHA1 Message Date
chenyu
773d5b60bf beam benchmark tests (#7638)
* beam benchmark tests

* lower AMD number somehow

* less flaky
2024-11-11 18:11:18 -05:00
chenyu
bfab03288d fix HALF=1 in test_speed_v_torch (#7642)
* fix HALF=1 in test_speed_v_torch

"operation cache defeats" adds 1 to all args, which were centered around 0. adding 1 makes big matmul and matvec go inf.

fixed by subtracting 1 afterwards and bumped tolerance for half input

* bigger tol for BIG=2, update CI too

* bigger tol
2024-11-11 14:29:37 -05:00
nimlgen
4d81b7952a qcom match texture/sampler descriptors to OpenCL (#7622)
* qcom ioctl compare more regs

* bug fix
2024-11-11 21:56:51 +03:00
George Hotz
d40673505f new cloud is cloudy [pr] (#7631)
* new cloud is cloudy [pr]

* waste lines to add security

* safety, with speed and less lines

* timing and del

* lines

* cleanups

* restore CloudSession

* bump to 3.10

* quotes

* renderer security
2024-11-11 20:18:04 +08:00
George Hotz
bbc64bf305 x|(x&y) -> x (#7629)
* x|(x&y) -> x

* fix tests
2024-11-11 10:00:18 +08:00
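
The rewrite named in the commit above, x|(x&y) -> x, is the absorption law for bitwise operations. The snippet below is only an illustrative check of that identity on plain Python integers; it is not the tinygrad rewrite rule, and the helper name `absorbs` is a hypothetical one chosen for this sketch.

```python
from itertools import product


def absorbs(x: int, y: int) -> bool:
    """Return True when x | (x & y) equals x (the absorption law)."""
    return (x | (x & y)) == x


# Exhaustively check every pair of 8-bit values.
all_hold = all(absorbs(x, y) for x, y in product(range(256), repeat=2))

# Python ints behave like infinite two's complement, so negatives work too.
negatives_hold = all(absorbs(x, y) for x, y in product(range(-8, 8), repeat=2))

print(all_hold, negatives_hold)
```

Every bit set in x&y is already set in x, so OR-ing it back into x changes nothing; that is why the expression can be replaced by x alone.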
uuuvn
94a484542b Hook memoryview via class instead of a function (#7627) 2024-11-11 09:07:06 +08:00
qazal
a8da84cce0 recursive swizzle with just graph_rewrite [pr] (#7626) 2024-11-10 20:14:21 +02:00
qazal
092a441748 test swizzle post permute (#7623)
* test swizzle post permute

* add st_fixup assert
2024-11-10 16:18:22 +02:00
George Hotz
745316493c hotfix: add test_simple_conv2d_bias 2024-11-10 18:36:42 +08:00
George Hotz
0a411b4f68 replace llvm with new llvm (#7616)
* replace llvm with new llvm

* fix test_linearizer

* minor fixups

* fix alloca

* don't use alloca

* fix DEFINE_ACC

* lines

* comments and lines

* a little tighter
2024-11-10 11:28:52 +08:00
qazal
b61266eb97 late fusion spec for big graph [pr] (#7613) 2024-11-09 23:43:11 +08:00
qazal
9d6b03d691 early assert swizzle in kernel [pr] (#7610)
* early assert swizzle in kernel [pr]

* better

* note changes

* TestIndexing 2
2024-11-09 21:54:43 +08:00
chenyu
8ca422e21a script to compare kernel opt with BEAM (#7604)
interesting that on M1 Max hcopt beats BEAM 2 about 20% of the time
2024-11-08 17:40:28 -05:00
chenyu
573f145dcf METAL raise RuntimeError with no compiler and bad src (#7603)
fixed BEAM if src is invalid on METAL. it currently only accepts RuntimeError in `_time_program`
2024-11-08 17:09:12 -05:00
chenyu
74b4d1c1e1 rewrite idx again in real_strides after uop_given_valid (#7600)
uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.
2024-11-08 14:30:32 -05:00
chenyu
c6189e38c1 simplify_valid in real_strides (#7599)
improved one more real_strides. after finishing the last one will think about always applying these in to_indexed_uops
2024-11-08 10:45:22 -05:00
Ahmed Harmouche
e35226e698 Remove Ops.ALU (#7595) 2024-11-08 19:52:14 +08:00
Harald Schäfer
e7cbc29f48 openpilot benchmark: add cast from numpy to benchmark (#7593)
* openpilot benchmark: add cast from numpy to benchmark

* whitespace

* comment
2024-11-08 19:31:00 +08:00
chenyu
a1dfd288bb different valid order (#7589)
in simplify_valid, we start with valids that are in others' parents so the others are more likely to be simplified
2024-11-07 20:27:56 -05:00
chenyu
4378b100ad make UOp.range arg a tuple [pr] (#7583)
* make UOp.range arg a tuple [pr]

so render works on output of ShapeTracker.to_indexed_uops

* fix
2024-11-07 11:58:09 -05:00
chenyu
bb7b5362be uop_given_valid in real_strides (#7231)
simplified idx allows deriving more strides
2024-11-07 09:41:16 -05:00
uuuvn
c846dd70b2 Increase test tolerance for probabilistic test (#7580) 2024-11-07 09:35:11 -05:00
George Hotz
205befa788 move is_dtype_supported to device [pr] (#7575) 2024-11-07 20:38:03 +08:00
qazal
1f5ea1e412 late fusion tests, early merge view GroupOp.Buffer [pr] (#7577)
* test_late_fusion_double_transpose

* early merge view buffer ops
2024-11-07 20:04:57 +08:00
qazal
f0fc34e594 swizzle tests from the delete_fuse branch [pr] (#7576)
* swizzle tests from the delete branch [pr]

* actually test torch

* atol
2024-11-07 18:29:06 +08:00
chenyu
a011562450 fix view add with symbolic shape (#7569)
the issue is that the symbolic shape is not greedily simplified and canonicalized before reshape
2024-11-06 11:39:20 -05:00
qazal
6a19ca81c9 failing test for View.__add__ RecursionError (#7567)
* failing test for View.__add__ RecursionError

* move to test_symbolic_shapetracker
2024-11-06 23:46:47 +08:00
qazal
a9a040398c don't print the entire schedule on assert [pr] (#7565)
* don't print the entire schedule on assert [pr]

* extra
2024-11-06 18:29:50 +08:00
chenyu
c805e3fff5 skip test_jit_batch_split if JIT >= 2 (#7561)
* skip test_jit_batch_split if JIT >= 2

only test graphs

* 1600
2024-11-05 14:59:04 -05:00
chenyu
f2fa183651 increase threshold test_strongly_connected_DAG (#7560)
it should test some other properties. flaky with the timing test: https://github.com/chenyuxyz/tinygrad/actions/runs/11688403523/job/32548762512
2024-11-05 11:44:39 -05:00
Carl Basho
630a7f37cf update tests (#7554)
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
chenyu
207bca6cea set PAGE_SIZE=1 and generate new dataset (#7559)
13080 rows in total. both generating and loading this are pretty broken now. filters are wrong for example
2024-11-05 11:25:01 -05:00
geohotstan
934fb73994 fix test_schedule conv2d bug (#7549)
* tests tests tests

* slap a resolve on it

* fix comment
2024-11-05 09:07:25 -05:00
George Hotz
99bd4372a5 Ops.ALU is no more, the arg is just an op (#7525)
* op arg alu [pr]

* more

* more passing

* fix more tests

* more tests passing

* fix single failing test

* so much cleaner

* noop to not have process replay trigger

* fix ptx
2024-11-05 00:22:22 +08:00
Ahmed Harmouche
36488a2a43 Use is_dtype_supported in more places in tests (#7529) 2024-11-04 09:21:15 -05:00
qazal
b5718ae135 image dtype fusion tests [pr] (#7530)
* update test_lil_model

* add test_image_matmul
2024-11-04 22:00:16 +08:00
George Hotz
9c3ee64a3e hotfix: QoL assert if op is a str 2024-11-04 17:11:38 +08:00
George Hotz
0c19b6298b rename ops to have unique names (#7522) 2024-11-04 17:09:45 +08:00
George Hotz
9a7cc04843 fix viz [pr] (#7519)
* fix viz [pr]

* Update serve.py
2024-11-04 15:02:41 +08:00
George Hotz
6bb230287b pass the src into Metal [pr] (#7518)
* pass the src into Metal [pr]

* put that comment back

* keep old functionality

* move all to disassembler

* metal supports parallel beam

* touchups

* comment in correct place
2024-11-04 12:35:30 +08:00
George Hotz
bac251d2c1 idx_load_store in lowerer [pr] (#7477)
* idx_load_store in lowerer [pr]

* fix tests (#7513)

Co-authored-by: John Doe <null@mail.com>

* work

---------

Co-authored-by: Carl Basho <76494676+oldpondplop@users.noreply.github.com>
Co-authored-by: John Doe <null@mail.com>
2024-11-04 10:18:40 +08:00
chenyu
7758f7211b Revert "s/UPat/Pat (#7506)" [pr] (#7517)
* Revert "s/UPat/Pat (#7506)"

This reverts commit 400011a8c1.

* fix
2024-11-03 16:33:02 -05:00
chenyu
df49439b9a remove reassoc from LLVM flags (#7512)
reassoc reorders compute and breaks transcendental
2024-11-03 13:11:56 -05:00
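
The commit above notes that reassoc reorders compute. As a rough illustration of why reordering matters, floating-point addition is not associative, so regrouping the same operands can change the result. The snippet below uses plain Python floats as an assumption-laden stand-in; it does not involve LLVM or tinygrad's transcendental code.

```python
# The same three operands, grouped two different ways.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # rounds after a + b, then again after adding c
right = a + (b + c)  # rounds after b + c, then again after adding a

print(left == right)       # the two groupings disagree
print(abs(left - right))   # the gap is on the order of one ulp
```

A one-ulp difference is harmless in most code, but numerical routines that rely on a specific evaluation order to cancel rounding error can lose accuracy when a compiler is allowed to regroup terms.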
chenyu
2f70fb893e move transcendental fuzzer test to test_transcendental (#7511) 2024-11-03 12:36:50 -05:00
chenyu
84592225d8 tweak tqdm (#7510)
reduce parentheses and fuzz more tests now there's no sleep
2024-11-03 12:07:11 -05:00
chenyu
c25a69b97e fix tqdm tests (#7509)
time.sleep masked two issues:
(1) iters_per_sec might have unitscale in it, and calling `float` on it fails
(2) default rate is too low to ensure the output matches, it might skip updating
2024-11-03 10:53:22 -05:00
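
Issue (1) in the commit above can be sketched as follows: a rate string that carries a unit-scale suffix cannot be passed straight to `float`. The helper `parse_rate` and the suffix table are hypothetical names invented for this sketch, and the exact string format tinygrad's tqdm emits is an assumption here, not taken from the source.

```python
# Hypothetical suffix table; the real set of suffixes is an assumption.
SUFFIXES = {"k": 1e3, "M": 1e6, "G": 1e9}


def parse_rate(text: str) -> float:
    """Parse a rate such as '1.50k' or '42.0' into a plain float."""
    if text and text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


# Calling float directly on a scaled string raises ValueError.
try:
    float("1.50k")
    direct_call_failed = False
except ValueError:
    direct_call_failed = True

print(direct_call_failed, parse_rate("1.50k"), parse_rate("42.0"))
```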
chenyu
4617c9a565 move COMMUTATIVE flipping to symbolic (#7507)
* move COMMUTATIVE flipping to symbolic

it cannot go with TRANSCENDENTAL

* skip LLVM
2024-11-03 09:03:45 -05:00
chenyu
400011a8c1 s/UPat/Pat (#7506) 2024-11-03 08:26:19 -05:00
George Hotz
c8bf09b7d4 s/UOps/Ops (#7500)
* s/UOps/Ops [pr]

* fix
2024-11-03 11:26:10 +08:00
chenyu
91a3b27fa9 disable test_setitem_inplace_operator again (#7495)
it was flaky, not broken broken
2024-11-02 19:01:23 -04:00