chenyu
773d5b60bf
beam benchmark tests ( #7638 )
...
* beam benchmark tests
* lower AMD number somehow
* less flaky
2024-11-11 18:11:18 -05:00
chenyu
bfab03288d
fix HALF=1 in test_speed_v_torch ( #7642 )
...
* fix HALF=1 in test_speed_v_torch
"operation cache defeats" adds 1 to all args, which were centered around 0. adding 1 makes big matmul and matvec go inf.
fixed by subtracting 1 afterwards and bumped tolerance for half input
* bigger tol for BIG=2, update CI too
* bigger tol
2024-11-11 14:29:37 -05:00
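The overflow described in the HALF=1 fix can be illustrated without tinygrad. The sketch below is only an illustration under assumptions: the uniform input distribution, the reduction length `n`, and the helper `dot` are hypothetical and are not taken from test_speed_v_torch. It shows that a dot product of zero-centered inputs stays small, while the same inputs shifted by 1 sum to roughly `n`, which exceeds the largest finite half-precision value once `n` is large enough.

```python
import random

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def dot(n, shift, seed=0):
    """Dot product of two length-n vectors drawn from uniform[-0.5, 0.5) + shift.

    Computed in Python floats (float64); we only compare the magnitude of the
    result against the fp16 range, we do not emulate fp16 arithmetic.
    """
    rng = random.Random(seed)
    a = [rng.random() - 0.5 + shift for _ in range(n)]
    b = [rng.random() - 0.5 + shift for _ in range(n)]
    return sum(x * y for x, y in zip(a, b))


n = 100_000  # hypothetical reduction length, chosen only to make the point
centered = dot(n, shift=0.0)  # inputs centered around 0
shifted = dot(n, shift=1.0)   # same distribution after "adding 1 to all args"

print("centered fits in fp16:", abs(centered) < FP16_MAX)
print("shifted fits in fp16: ", abs(shifted) < FP16_MAX)
```

With zero-centered inputs the products cancel on average, so the sum grows like the square root of `n`; with inputs centered around 1 each product is about 1, so the sum grows like `n`.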
nimlgen
4d81b7952a
qcom match texture/sampler descriptors to OpenCL ( #7622 )
...
* qcom ioctl compare more regs
* bug fix
2024-11-11 21:56:51 +03:00
qazal
0b66a0d688
only lookup buf_uops in fuse.py [pr] ( #7641 )
2024-11-11 19:14:30 +02:00
qazal
08b9f055f2
don't need outputs in fuse.py [pr] ( #7639 )
2024-11-11 18:35:31 +02:00
George Hotz
b4cb6b89f9
hotfix: CI mac uses python 3.11
2024-11-11 23:42:35 +08:00
George Hotz
9648372ee6
hotfix: mac uses python 3.12
2024-11-11 23:23:48 +08:00
George Hotz
aaa8059aec
python 3.10 is minimum [pr] ( #7636 )
2024-11-11 23:05:50 +08:00
Kinvert
6a0ed46b1c
adding viz to env_vars docs ( #7630 )
2024-11-11 21:28:27 +08:00
George Hotz
d40673505f
new cloud is cloudy [pr] ( #7631 )
...
* new cloud is cloudy [pr]
* waste lines to add security
* safety, with speed and less lines
* timing and del
* lines
* cleanups
* restore CloudSession
* bump to 3.10
* quotes
* renderer security
2024-11-11 20:18:04 +08:00
qazal
766a680588
swizzle parents with graph rewrite ( #7625 )
...
* delete st_fixup
* refactor
* minimal diff
2024-11-11 16:50:38 +08:00
qazal
fec977b966
calling view on graph edges is fine [pr] ( #7632 )
2024-11-11 16:35:18 +08:00
George Hotz
bbc64bf305
x|(x&y) -> x ( #7629 )
...
* x|(x&y) -> x
* fix tests
2024-11-11 10:00:18 +08:00
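The rewrite `x|(x&y) -> x` is the absorption law of Boolean algebra. A minimal check, independent of tinygrad's rewrite engine (the helper name `absorbs` is made up for this sketch), exhaustively confirms the identity for 8-bit integers and for booleans:

```python
from itertools import product


def absorbs(x, y):
    """True when x | (x & y) equals x, i.e. the absorption law holds."""
    return (x | (x & y)) == x


# Every bit set in (x & y) is already set in x, so OR-ing it back adds nothing.
all_bytes_ok = all(absorbs(x, y) for x, y in product(range(256), repeat=2))
all_bools_ok = all(absorbs(x, y) for x, y in product([False, True], repeat=2))

print(all_bytes_ok, all_bools_ok)
```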
uuuvn
94a484542b
Hook memoryview via class instead of a function ( #7627 )
2024-11-11 09:07:06 +08:00
qazal
a8da84cce0
recursive swizzle with just graph_rewrite [pr] ( #7626 )
2024-11-10 20:14:21 +02:00
qazal
7275cfb9d8
cleanup swizzle upats [pr] ( #7624 )
2024-11-10 17:05:27 +02:00
qazal
092a441748
test swizzle post permute ( #7623 )
...
* test swizzle post permute
* add st_fixup assert
2024-11-10 16:18:22 +02:00
George Hotz
745316493c
hotfix: add test_simple_conv2d_bias
2024-11-10 18:36:42 +08:00
George Hotz
44c1fd5661
add optional llvm opt [pr] ( #7619 )
2024-11-10 13:26:49 +08:00
George Hotz
0a411b4f68
replace llvm with new llvm ( #7616 )
...
* replace llvm with new llvm
* fix test_linearizer
* minor fixups
* fix alloca
* don't use alloca
* fix DEFINE_ACC
* lines
* comments and lines
* a little tighter
2024-11-10 11:28:52 +08:00
qazal
b61266eb97
late fusion spec for big graph [pr] ( #7613 )
2024-11-09 23:43:11 +08:00
qazal
9d6b03d691
early assert swizzle in kernel [pr] ( #7610 )
...
* early assert swizzle in kernel [pr]
* better
* note changes
* TestIndexing 2
2024-11-09 21:54:43 +08:00
chenyu
8ca422e21a
script to compare kernel opt with BEAM ( #7604 )
...
interesting that on m1 max hcopt beats BEAM 2 about 20% of the time
2024-11-08 17:40:28 -05:00
chenyu
573f145dcf
METAL raise RuntimeError with no compiler and bad src ( #7603 )
...
fixed BEAM if src is invalid on METAL. it currently only accepts RuntimeError in `_time_program`
2024-11-08 17:09:12 -05:00
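The note above implies a search loop that tolerates exactly one exception type. The following is a rough sketch of that pattern, not tinygrad's actual code: `compile_and_time`, `time_candidate`, and `pick_fastest` are hypothetical names, and the fake timing is a placeholder. A candidate whose source fails to compile is scored as infinitely slow only if the failure surfaces as `RuntimeError`; any other exception type would abort the search, which is presumably why the backend needs to raise `RuntimeError`.

```python
import math


def compile_and_time(src):
    """Hypothetical stand-in for compiling and timing a kernel."""
    if "invalid" in src:
        # A backend that raised some other exception type here would
        # escape the handler in time_candidate and stop the whole search.
        raise RuntimeError(f"compile failed: {src!r}")
    return float(len(src))  # fake timing in arbitrary units


def time_candidate(src):
    """Return the measured time, or infinity if the candidate cannot be built."""
    try:
        return compile_and_time(src)
    except RuntimeError:
        return math.inf


def pick_fastest(candidates):
    """Choose the candidate with the smallest time; failed ones never win."""
    return min(candidates, key=time_candidate)


print(pick_fastest(["invalid", "kernel_long", "k"]))
```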
chenyu
74b4d1c1e1
rewrite idx again in real_strides after uop_given_valid ( #7600 )
...
uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.
2024-11-08 14:30:32 -05:00
chenyu
c6189e38c1
simplify_valid in real_strides ( #7599 )
...
improved one more real_strides. after finishing the last one will think about always applying these in to_indexed_uops
2024-11-08 10:45:22 -05:00
George Hotz
d8691a4f03
lil touchups ( #7597 )
2024-11-08 22:31:43 +08:00
Ahmed Harmouche
e35226e698
Remove Ops.ALU ( #7595 )
2024-11-08 19:52:14 +08:00
Harald Schäfer
e7cbc29f48
openpilot benchmark: add cast from numpy to benchmark ( #7593 )
...
* openpilot benchmark: add cast from numpy to benchmark
* whitespace
* comment
2024-11-08 19:31:00 +08:00
Ahmed Harmouche
d4e91b0de7
num_batches_tracked long only if supported [pr] ( #7582 )
2024-11-08 19:28:21 +08:00
chenyu
a1dfd288bb
different valid order ( #7589 )
...
in simplify_valid, we start with valids that are in others' parents so the others are more likely to be simplified
2024-11-07 20:27:56 -05:00
chenyu
dc7b0e2bb7
call VACUUM in diskcache_clear ( #7588 )
...
reclaims the db space after DROP TABLE
2024-11-07 16:55:48 -05:00
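Why `VACUUM` is needed after `DROP TABLE` can be seen with the standard-library `sqlite3` module. This is a sketch under assumptions: the table name, row count, and blob size are arbitrary, the database uses SQLite's default settings (auto_vacuum off), and nothing here is taken from `diskcache_clear` itself. Dropping a table only moves its pages to the free list, so the file keeps its size until `VACUUM` rebuilds it.

```python
import os
import sqlite3
import tempfile


def sizes_around_vacuum(rows=32, blob_bytes=64 * 1024):
    """Return database file sizes (filled, after DROP TABLE, after VACUUM)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache.db")
        # isolation_level=None gives autocommit mode; VACUUM cannot run
        # inside an open transaction.
        conn = sqlite3.connect(path, isolation_level=None)
        try:
            conn.execute("CREATE TABLE kv (key INTEGER PRIMARY KEY, val BLOB)")
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO kv VALUES (?, ?)",
                ((i, os.urandom(blob_bytes)) for i in range(rows)),
            )
            conn.execute("COMMIT")
            filled = os.path.getsize(path)

            conn.execute("DROP TABLE kv")  # pages go to the free list
            dropped = os.path.getsize(path)

            conn.execute("VACUUM")         # rebuilds the file without free pages
            vacuumed = os.path.getsize(path)
        finally:
            conn.close()
    return filled, dropped, vacuumed


filled, dropped, vacuumed = sizes_around_vacuum()
print(f"filled={filled} dropped={dropped} vacuumed={vacuumed}")
```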
chenyu
4378b100ad
make UOp.range arg a tuple [pr] ( #7583 )
...
* make UOp.range arg a tuple [pr]
so render works on output of ShapeTracker.to_indexed_uops
* fix
2024-11-07 11:58:09 -05:00
chenyu
bb7b5362be
uop_given_valid in real_strides ( #7231 )
...
simplified idx allows deriving more strides
2024-11-07 09:41:16 -05:00
uuuvn
c846dd70b2
Increase test tolerance for probabilistic test ( #7580 )
2024-11-07 09:35:11 -05:00
George Hotz
205befa788
move is_dtype_supported to device [pr] ( #7575 )
2024-11-07 20:38:03 +08:00
qazal
1f5ea1e412
late fusion tests, early merge view GroupOp.Buffer [pr] ( #7577 )
...
* test_late_fusion_double_transpose
* early merge view buffer ops
2024-11-07 20:04:57 +08:00
qazal
f0fc34e594
swizzle tests from the delete_fuse branch [pr] ( #7576 )
...
* swizzle tests from the delete branch [pr]
* actually test torch
* atol
2024-11-07 18:29:06 +08:00
chenyu
a011562450
fix view add with symbolic shape ( #7569 )
...
the issue is that the symbolic shape is not greedily simplified and canonicalized before reshape
2024-11-06 11:39:20 -05:00
qazal
fbd7d16e9e
create realizes later [pr] ( #7571 )
2024-11-07 00:24:07 +08:00
qazal
6a19ca81c9
failing test for View.__add__ RecursionError ( #7567 )
...
* failing test for View.__add__ RecursionError
* move to test_symbolic_shapetracker
2024-11-06 23:46:47 +08:00
chenyu
348d37df46
a few more unused type ignore [pr] ( #7568 )
2024-11-06 10:17:19 -05:00
qazal
37172b3f39
delete type: ignore that shouldn't exist [pr] ( #7566 )
2024-11-06 10:04:35 -05:00
qazal
a9a040398c
don't print the entire schedule on assert [pr] ( #7565 )
...
* don't print the entire schedule on assert [pr]
* extra
2024-11-06 18:29:50 +08:00
Anthony DeMattos
953ef1b57e
tinychat ui +/- 20 lines ( #7471 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-06 14:23:55 +08:00
chenyu
e7b18cf5c0
fix load_worlds filter_novariable ( #7564 )
...
filter based on "DEFINE_VAR" instead of "Variable". also added a unit test to make sure dataset includes image and variable kernels
2024-11-05 16:06:39 -05:00
chenyu
c805e3fff5
skip test_jit_batch_split if JIT >= 2 ( #7561 )
...
* skip test_jit_batch_split if JIT >= 2
only test graphs
* 1600
2024-11-05 14:59:04 -05:00
chenyu
f2fa183651
increase threshold test_strongly_connected_DAG ( #7560 )
...
it should test some other properties. flaky with the time test https://github.com/chenyuxyz/tinygrad/actions/runs/11688403523/job/32548762512
2024-11-05 11:44:39 -05:00
Carl Basho
630a7f37cf
update tests ( #7554 )
...
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
chenyu
207bca6cea
set PAGE_SIZE=1 and generate new dataset ( #7559 )
...
13080 rows in total. both generating and loading this are pretty broken now. filters are wrong, for example
2024-11-05 11:25:01 -05:00