George Hotz
999b44c274
fix external test + speed
2023-03-03 06:46:16 -08:00
George Hotz
8919ca8163
test cleanups
2023-03-03 06:36:06 -08:00
George Hotz
459488bba2
fix linter (#630)
* fix linter
* no imports okay
* explicit bases
* disable in pylintrc
2023-03-02 20:06:20 -08:00
George Hotz
3915c89fb6
symbolic improvements (#629)
* fixups
* shorter diff
* wow, okay removing that had side effects
* more numeric tests
* MIN MAX tests
2023-03-02 19:50:38 -08:00
George Hotz
b842cdf11f
support wait in cuda
2023-03-02 10:39:26 -08:00
George Hotz
dc88ad3342
fix ops print bug
2023-03-02 10:33:03 -08:00
George Hotz
0335cb86b9
refactor comparison. there's a bug in the method cache
2023-03-02 10:10:16 -08:00
George Hotz
11f257ddf9
clean up cmp tests
2023-03-02 09:35:16 -08:00
George Hotz
8902764167
fix nits in compare
2023-03-02 08:15:26 -08:00
Diogo
52204a7b88
adding comparison operators (#616)
* Less, LessOrEqual, Greater, GreaterOrEqual, Equal
* lint fix
* using built in functions
* overriding __eq__ breaks things
* backwards pass for less - forward only tests
* one other spot
* removing backwards for comparison ops to match pytorch
* raise runtime error
* more tests for comparison ops
* fixed the lineup
* added number upcast tests
2023-03-02 08:10:44 -08:00
George Hotz
2e26286294
speed like you wouldn't believe (#626)
* speed like you wouldn't believe
* fix tests
2023-03-02 07:49:19 -08:00
Martin Loretz
51fb6aeb45
Fix cuda runtime (#625)
2023-03-02 06:52:34 -08:00
George Hotz
fca055bd66
NOOP means contiguous
2023-03-01 21:54:51 -08:00
George Hotz
d062cc82b8
put restrict back
2023-03-01 21:34:45 -08:00
George Hotz
201d9a2d58
remove extra copy on output
2023-03-01 21:19:44 -08:00
George Hotz
b442e75c7a
test speed v torch
2023-03-01 19:50:12 -08:00
George Hotz
bfcec234a2
Refactor ASTs (#622)
* ugh worst branch name
* compiler refactor continues
* scc -> cloc
* buf -> _buf
* finish _buf, and program -> runtime
* gpu is still working, clang isn't
* clang in new style
* ops_metal
* something broke it
* improve metal
* clean up tons of cl crap
* hack fix sync
* cleaner gpu
* gpu metal clang
* cleanups
* minor refactor
* GPUCodegen
* fix up LLVM
* blind CUDA refactor
* codegen / runtime
* keep ops naming
* linter passes
* woah, llvm was allocing 4x what it needed to
* bugfixes
* fix openpilot compiler
* fix compile_efficientnet
* method cache should fix tests
* deal with duped functions
2023-03-01 18:57:29 -08:00
Jacky Lee
5e41d5857c
Add tests for randomness (#621)
* Add tests for random creation functions
* It worked on my machine!
* Rename to helper_same_distribution
* Remove extra line
* Add tests for equal distribution
* Test without scipy
* Do a different test for randn
2023-03-01 15:39:20 -08:00
George Hotz
0055f0c2b3
touchups
2023-02-28 20:36:11 -08:00
George Hotz
f4aa3868e3
remove save_for_backward, there's still lines to save in the simplest places
2023-02-28 20:11:05 -08:00
George Hotz
7ff92550bb
slice -> pad, shrink
2023-02-28 19:58:12 -08:00
George Hotz
ea3fa07c2a
bump tinygrad to 0.5, move reshape logic from mlops
2023-02-28 18:07:03 -08:00
George Hotz
e9e71fbfc4
remove mlop (#619)
* remove mlop
* lil simpler
2023-02-28 17:58:24 -08:00
George Hotz
6b423b675d
fix mulacc when both strides are 0
2023-02-28 17:27:52 -08:00
George Hotz
4c4d88aad4
fix the last bug, and make HLOP the default
2023-02-28 17:04:28 -08:00
George Hotz
fde6c2d62b
fix image grouping
2023-02-28 16:50:46 -08:00
George Hotz
17c55f051d
fix test symbolic
2023-02-28 16:37:08 -08:00
George Hotz
28f52f7c24
improve symbolic
2023-02-28 16:21:58 -08:00
George Hotz
1702a5779f
remove hacks from can_merge
2023-02-28 15:30:20 -08:00
George Hotz
e21df1701b
distribute + refactor merge_views
2023-02-28 14:57:56 -08:00
George Hotz
7e6edfbc64
unbreak onnx conv padding
2023-02-28 13:55:03 -08:00
George Hotz
7d556ca7e0
avg/max pool work in N-D
2023-02-28 13:38:27 -08:00
George Hotz
dcb50a3a9f
better hlop image conv
2023-02-28 13:06:58 -08:00
George Hotz
9d539b8ebb
more intuitive output shape from _pool
2023-02-28 11:41:48 -08:00
George Hotz
d722ffbd04
_pool2d -> _pool
2023-02-28 11:35:19 -08:00
George Hotz
3c8da6bd03
add typing
2023-02-28 10:54:46 -08:00
George Hotz
922f96e527
DeviceBuffer : shape can be correct type now
2023-02-28 10:08:55 -08:00
George Hotz
a8bbcccc16
debug print shapetrackers
2023-02-28 08:11:40 -08:00
George Hotz
cfa5a12f13
simplify in shapetracker
2023-02-28 00:35:26 -08:00
George Hotz
8478a61cdb
simplify in shapetracker
2023-02-28 00:35:26 -08:00
George Hotz
262f81d795
applegpu everywhere
2023-02-27 22:54:59 -08:00
George Hotz
d584bae5c0
fine, openpilot can have 197 kernels
2023-02-27 11:48:36 -08:00
George Hotz
7b999add1d
all onnx model tests pass
2023-02-27 11:22:45 -08:00
George Hotz
652d48ccec
onnx : openpilot expand issue was fixed yesterday. remove hack
2023-02-27 11:04:42 -08:00
George Hotz
9d6b63f043
add ConstantOfShape
2023-02-27 10:57:50 -08:00
George Hotz
082134952b
CastLike works with one type hack
2023-02-27 10:51:26 -08:00
Jacky Lee
1ffe8d68d5
Add more onnx ops (#615)
* Add Celu
* Add thresholded relu
* Add softsign
2023-02-27 10:43:41 -08:00
George Hotz
643e8b0388
fix tests, test bn evaluate too
2023-02-27 10:39:47 -08:00
George Hotz
2f17d151b3
fix batchnorm not realizing
2023-02-27 10:19:54 -08:00
George Hotz
c9252d38b2
mypy cache breaks if you sometimes check untyped defs, no checking tests for now
2023-02-27 09:57:33 -08:00