chenyu
9fb396f660
test_ops maxpool2d -> max_pool2d ( #7696 )
...
also avgpool2d -> avg_pool2d, to make the tests easier to grep
2024-11-14 10:39:12 -05:00
geohotstan
f8056a74d6
combine pad2d with pad ( #7677 )
...
* I have pad2d, I have pad, uuh~, pad2dpad~
* fix some small things
* strategically placed cast hack
* fix more
* fix more more
* tests
* periods
2024-11-14 17:56:02 +08:00
qazal
0914c2fec9
add TestLinearizerFailures test_failure_56 and test_failure_57 ( #7682 )
...
* add test_failure_56 and test_failure_57
* so it's only METAL=1
2024-11-14 12:00:33 +08:00
chenyu
333f5f9f8b
Tensor.bitwise_not ( #7688 )
...
implemented with xor in tensor for now to not add another op. also used it in Tensor.min to fix dtype int on -2**31
2024-11-13 16:31:52 -05:00
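A minimal sketch of the idea in the Tensor.bitwise_not commit above (333f5f9f8b), in plain Python rather than tinygrad code. It assumes that the earlier `min` was computed as `-max(-x)`, which the commit message does not state. All function names here (`wrap_int32`, `bitwise_not_via_xor`, `min_via_negation`, `min_via_not`) are hypothetical and only illustrate why a xor-based NOT avoids the overflow at -2**31.

```python
INT32_MIN = -2**31

def wrap_int32(v: int) -> int:
    """Wrap a Python int into the two's-complement int32 range."""
    return (v - INT32_MIN) % 2**32 + INT32_MIN

def bitwise_not_via_xor(v: int) -> int:
    # XOR with all ones (-1 in two's complement) flips every bit, so no separate NOT op is needed.
    return wrap_int32(v ^ -1)

def min_via_negation(values):
    # Assumed old formulation: min(x) = -max(-x). Negating -2**31 wraps back to -2**31.
    return wrap_int32(-max(wrap_int32(-v) for v in values))

def min_via_not(values):
    # ~x = -x - 1 is strictly decreasing and never overflows, so min(x) = ~max(~x).
    return bitwise_not_via_xor(max(bitwise_not_via_xor(v) for v in values))

values = [INT32_MIN, 0, 5]
print(min_via_negation(values))  # wrong result because of int32 wraparound
print(min_via_not(values))       # matches the built-in min(values)
```

The NOT-based form works because bitwise NOT reverses ordering without leaving the representable range, whereas arithmetic negation has no int32 result for -2**31.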
chenyu
fb933b79a6
add test case for nll_loss with input > 2D ( #7685 )
...
* failed test case for nll_loss with input > 2D
* fixed
* add more
2024-11-13 14:34:07 -05:00
geohotstan
9c41c376d3
add Tensor.nll_loss ( #7683 )
...
* move nll_loss to new branch
* make nll_loss examples practical
* self *is*
* add to docs
* small
2024-11-13 13:12:13 -05:00
chenyu
3c6fe4b79a
fix Tensor.bitwise_and and Tensor.bitwise_or to support bool ( #7684 )
2024-11-13 13:10:39 -05:00
chenyu
3d82f8e340
simpler rand_like ( #7680 )
2024-11-13 12:28:41 -05:00
James
d4e4a084a1
fix: Tensor min function for unsigned ints ( #7675 )
...
* add failing tests for uint8 `min()`
* fix unsigned data type min()
* fix test data
* fix whitespace
---------
Co-authored-by: rezaarezvan <reza@rezvan.xyz>
Co-authored-by: Jamesb <experimentallearning0@gmail.com>
2024-11-13 11:04:27 -05:00
chenyu
d1dfd598a2
assert specifying device to rand_like a multi tensor ( #7678 )
...
* assert specifying device to rand_like a multi tensor
raise RuntimeError instead of dropping it silently
* fix that
2024-11-13 10:24:40 -05:00
chenyu
51432bfbff
add rand_like test case with device specified ( #7663 )
...
in single device or copied multi case, device is applied. but for sharded case the device is silently ignored now. maybe similar to rand we just don't allow tuple device in rand_like
2024-11-13 09:32:55 -05:00
Reza Rezvan
23363dee55
Add: failing tests for uint8 min() ( #7669 )
...
* add failing tests for uint8 `min()`
* mark as expected failure
2024-11-13 22:12:53 +08:00
qazal
e84d089ef1
delete ReduceOps, only use REDUCE_AXIS ( #7667 )
2024-11-13 19:04:27 +08:00
chenyu
1884f021e3
add conv3x3 to speed_v_theoretical ( #7658 )
...
* add conv3x3 to speed_v_theoretical
* show test duration
2024-11-12 16:41:56 -05:00
chenyu
962dafb467
use randn in speed_v_theoretical instead of rand ( #7656 )
...
* use randn in speed_v_theoretical instead of rand
this made green gemv 20% faster... but why?
* update threshold
2024-11-12 15:00:32 -05:00
chenyu
6159790ab8
add gemv to speed_v_theoretical ( #7654 )
...
* add gemv to speed_v_theoretical
getting ~300GB/s if we just count the memory of inputs and output
* better green numbers
* flip
2024-11-12 11:19:35 -05:00
George Hotz
4f1f823021
add tiny test for randomness + remove ulong buffers ( #7648 )
...
* add tiny test for randomness
* Tensor._device_seeds is a Tuple
* no tuple, just a 2 element tensor
* no more longs
* fix tests, and maybe ocelot works now
* NV still doesn't work. cleanup rules
* test + two more rules
2024-11-12 12:45:52 +08:00
chenyu
c06a5a9c72
Tensor.linspace raises for dtype.bool ( #7649 )
...
also fixed an assert when passing str dtype to randint
2024-11-11 23:05:14 -05:00
geohotstan
5eef59d732
add Tensor.linspace ( #7609 )
...
* add linspace
* shave off tests and forgot to add to docs crap
* WHOOPS
* better tests
2024-11-12 10:29:36 +08:00
chenyu
99f29e50b2
update speed_v_theoretical numbers ( #7647 )
...
better amd after set compute profile
2024-11-11 20:05:13 -05:00
chenyu
773d5b60bf
beam benchmark tests ( #7638 )
...
* beam benchmark tests
* lower AMD number somehow
* less flaky
2024-11-11 18:11:18 -05:00
chenyu
bfab03288d
fix HALF=1 in test_speed_v_torch ( #7642 )
...
* fix HALF=1 in test_speed_v_torch
"operation cache defeats" adds 1 to all args, which were centered around 0. adding 1 makes big matmul and matvec go inf.
fixed by subtracting 1 afterwards and bumped tolerance for half input
* bigger tol for BIG=2, update CI too
* bigger tol
2024-11-11 14:29:37 -05:00
nimlgen
4d81b7952a
qcom match texture/sampler descriptors to OpenCL ( #7622 )
...
* qcom ioctl compare more regs
* bug fix
2024-11-11 21:56:51 +03:00
George Hotz
d40673505f
new cloud is cloudy [pr] ( #7631 )
...
* new cloud is cloudy [pr]
* waste lines to add security
* safety, with speed and less lines
* timing and del
* lines
* cleanups
* restore CloudSession
* bump to 3.10
* quotes
* renderer security
2024-11-11 20:18:04 +08:00
George Hotz
bbc64bf305
x|(x&y) -> x ( #7629 )
...
* x|(x&y) -> x
* fix tests
2024-11-11 10:00:18 +08:00
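The rewrite in the commit above (bbc64bf305), `x|(x&y) -> x`, is the absorption law of Boolean algebra. The following sketch only checks that identity exhaustively over small integers and booleans. It is not the tinygrad rewrite rule itself, and the function name `or_and` is hypothetical.

```python
from itertools import product

def or_and(x, y):
    """Left-hand side of the rewrite: x | (x & y)."""
    return x | (x & y)

# Every bit set in (x & y) is already set in x, so OR-ing it back in changes nothing.
ints_ok = all(or_and(x, y) == x for x, y in product(range(256), repeat=2))
bools_ok = all(or_and(x, y) == x for x, y in product([False, True], repeat=2))
print(ints_ok, bools_ok)
```

Because the identity holds bit by bit, it applies to integer and bool operands alike.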
uuuvn
94a484542b
Hook memoryview via class instead of a function ( #7627 )
2024-11-11 09:07:06 +08:00
qazal
a8da84cce0
recursive swizzle with just graph_rewrite [pr] ( #7626 )
2024-11-10 20:14:21 +02:00
qazal
092a441748
test swizzle post permute ( #7623 )
...
* test swizzle post permute
* add st_fixup assert
2024-11-10 16:18:22 +02:00
George Hotz
745316493c
hotfix: add test_simple_conv2d_bias
2024-11-10 18:36:42 +08:00
George Hotz
0a411b4f68
replace llvm with new llvm ( #7616 )
...
* replace llvm with new llvm
* fix test_linearizer
* minor fixups
* fix alloca
* don't use alloca
* fix DEFINE_ACC
* lines
* comments and lines
* a little tighter
2024-11-10 11:28:52 +08:00
qazal
b61266eb97
late fusion spec for big graph [pr] ( #7613 )
2024-11-09 23:43:11 +08:00
qazal
9d6b03d691
early assert swizzle in kernel [pr] ( #7610 )
...
* early assert swizzle in kernel [pr]
* better
* note changes
* TestIndexing 2
2024-11-09 21:54:43 +08:00
chenyu
8ca422e21a
script to compare kernel opt with BEAM ( #7604 )
...
interesting that on M1 Max, hcopt beats BEAM 2 about 20% of the time
2024-11-08 17:40:28 -05:00
chenyu
573f145dcf
METAL raise RuntimeError with no compiler and bad src ( #7603 )
...
fixed BEAM if src is invalid on METAL. it currently only accepts RuntimeError in `_time_program`
2024-11-08 17:09:12 -05:00
chenyu
74b4d1c1e1
rewrite idx again in real_strides after uop_given_valid ( #7600 )
...
uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.
2024-11-08 14:30:32 -05:00
chenyu
c6189e38c1
simplify_valid in real_strides ( #7599 )
...
improved one more real_strides. after finishing the last one will think about always applying these in to_indexed_uops
2024-11-08 10:45:22 -05:00
Ahmed Harmouche
e35226e698
Remove Ops.ALU ( #7595 )
2024-11-08 19:52:14 +08:00
Harald Schäfer
e7cbc29f48
openpilot benchmark: add cast from numpy to benchmark ( #7593 )
...
* openpilot benchmark: add cast from numpy to benchmark
* whitespace
* comment
2024-11-08 19:31:00 +08:00
chenyu
a1dfd288bb
different valid order ( #7589 )
...
in simplify_valid, we start with valids that are in others' parents, so the others are more likely to be simplified
2024-11-07 20:27:56 -05:00
chenyu
4378b100ad
make UOp.range arg a tuple [pr] ( #7583 )
...
* make UOp.range arg a tuple [pr]
so render works on output of ShapeTracker.to_indexed_uops
* fix
2024-11-07 11:58:09 -05:00
chenyu
bb7b5362be
uop_given_valid in real_strides ( #7231 )
...
simplified idx allows deriving more strides
2024-11-07 09:41:16 -05:00
uuuvn
c846dd70b2
Increase test tolerance for probabilistic test ( #7580 )
2024-11-07 09:35:11 -05:00
George Hotz
205befa788
move is_dtype_supported to device [pr] ( #7575 )
2024-11-07 20:38:03 +08:00
qazal
1f5ea1e412
late fusion tests, early merge view GroupOp.Buffer [pr] ( #7577 )
...
* test_late_fusion_double_transpose
* early merge view buffer ops
2024-11-07 20:04:57 +08:00
qazal
f0fc34e594
swizzle tests from the delete_fuse branch [pr] ( #7576 )
...
* swizzle tests from the delete branch [pr]
* actually test torch
* atol
2024-11-07 18:29:06 +08:00
chenyu
a011562450
fix view add with symbolic shape ( #7569 )
...
the issue is that the symbolic shape is not greedily simplified and canonicalized before reshape
2024-11-06 11:39:20 -05:00
qazal
6a19ca81c9
failing test for View.__add__ RecursionError ( #7567 )
...
* failing test for View.__add__ RecursionError
* move to test_symbolic_shapetracker
2024-11-06 23:46:47 +08:00
qazal
a9a040398c
don't print the entire schedule on assert [pr] ( #7565 )
...
* don't print the entire schedule on assert [pr]
* extra
2024-11-06 18:29:50 +08:00
chenyu
c805e3fff5
skip test_jit_batch_split if JIT >= 2 ( #7561 )
...
* skip test_jit_batch_split if JIT >= 2
only test graphs
* 1600
2024-11-05 14:59:04 -05:00
chenyu
f2fa183651
increase threshold test_strongly_connected_DAG ( #7560 )
...
it should test some other properties. the timing check is flaky: https://github.com/chenyuxyz/tinygrad/actions/runs/11688403523/job/32548762512
2024-11-05 11:44:39 -05:00