George Hotz
001cc96e25
Lazy refactor (#538)
* refactor lazy to return ASTs
* a lil cleaner
* oops, compare ids
* gate on GRAPH
* cleanups
* less calls to log_op
* simpler
* realize_buffers -> map_buffers
* even simpler
* think in asts
* a lil cleaner
* NOOP means contiguous
2023-02-07 11:53:21 -06:00
George Hotz
02d8cb0959
lazy cleanup
2023-02-07 07:39:53 -06:00
George Hotz
d93563f39f
fix KOPT
2023-02-07 06:56:33 -06:00
Jared Z
7604b17fbf
TestZeroViewShapeTracker fix test (#481)
* TestZeroViewST test
* updated to align with st naming conventions in file
* Update test_shapetracker.py
2023-02-07 06:17:55 -06:00
George Hotz
c073271f20
more symbolic correctness
2023-02-07 00:03:14 -06:00
George Hotz
e961fd3a04
more symbolic test, ModNode is wrong
2023-02-06 23:43:21 -06:00
George Hotz
8cfeb118d6
symbolic new test
2023-02-06 23:27:26 -06:00
George Hotz
7c5a5ecdac
even simpler symbolic
2023-02-06 22:47:00 -06:00
George Hotz
8b05de1841
symbolic cleanups
2023-02-06 22:12:11 -06:00
George Hotz
2a924e2b77
fix sz.sh for llvm
2023-02-06 15:36:05 -06:00
James Roberts
0d405fd5bc
Parallelize CI tests (#535)
2023-02-06 15:27:44 -06:00
Andrey
4977d6f225
using tuples in isinstance (#534)
2023-02-06 14:40:26 -06:00
timmermansjoy
d56c57b112
adding more robust install method (#532)
2023-02-06 13:12:05 -06:00
George Hotz
fd3807c479
delete cherry and old cuda accel, promote llvm
2023-02-06 10:02:41 -06:00
George Hotz
90529d3750
tests are 20% faster (#529)
* pytorch CPU
* no cache, it's slower
* pytorch cpu for real
* remove double onnx
2023-02-06 09:56:14 -06:00
George Hotz
039de1b332
oops, pytest is for testing
2023-02-06 09:30:12 -06:00
George Hotz
6eb0e6a650
shuffle deps: always tqdm, make linting category
2023-02-06 09:27:01 -06:00
George Hotz
1d80639646
make linter test install testing deps
2023-02-06 09:21:48 -06:00
George Hotz
60bb64811c
merge mypy into linters, no useless package update
2023-02-06 09:14:00 -06:00
George Hotz
c3d81bba2a
test_train: Adam -> SGD
2023-02-06 08:55:41 -06:00
George Hotz
36c26a57b1
make slow LLVM opt optional
2023-02-05 20:24:12 -06:00
George Hotz
f7291f6ca3
fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl
* fix optimizer
* KernelCache
* oops, broke batchnorm
* hack to fix it
* fix llvm, less hacky gpu
* disable the cache
* cache just breaks things
2023-02-05 10:46:17 -08:00
Martin Loretz
97f0a82be7
Cache pip packages in github actions (#522)
* Cache pip dependencies in github actions
* Add setup.py as cache-dependency-path
* Test caching
* Test caching
* Upgrade setup python action
* Test caching
* Remove setup.py from cache-dependency-path
* Don't remove cache-dependency-path
* Don't cache linter packages
* Test caching
* Test caching
* Test caching
* Upgrade actions/checkout to v3
2023-02-03 20:04:20 -08:00
Martin Loretz
4ad67b4bbc
Refactor triton buffer to use CLBuffer of cuda runtime (#524)
* Refactor triton buffer to use CLBuffer of runtime
* Fix opencl GT0
2023-02-03 20:02:41 -08:00
Jacky Lee
ad4f6aa2cf
Add test for quick_gelu (#526)
* Add test for quick_gelu
* Bump PyTorch version for approximate
2023-02-03 20:01:39 -08:00
James Roberts
db0a9b0a2d
Refactor CL.time_sum into GlobalCounters (#519)
2023-02-01 20:13:56 -08:00
Martin Loretz
45e847d284
Update triton to work in master (#517)
* Update triton to work in master
* Move mem_estimate out of runner
2023-02-01 12:58:14 -08:00
George Hotz
5e37f084db
stable diffusion: clean up constant folding
2023-02-01 12:53:16 -08:00
George Hotz
175c38d1b3
triton: it already was GT0
2023-02-01 12:00:33 -08:00
Jacky Lee
486f023e81
Rename Normalize and move to nn (#513)
* Rename Normalize and move to nn
* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
George Hotz
cd97b036cc
A Triton backend for tinygrad (#470)
* triton can add
* print stuff from triton
* write out file
* ops triton working
* reduce ops
* sort of works
* Triton bugfixes & implementation of remaining ops (#490)
* padding
* support pow, max, relu, gt0
* allocate return buffer
* Fix reduce
* Add tests for power op
* Fix triton illegal memory accesses and memory leak (#512)
* Fix mypy issue
* Add triton to setup.py
* Replace torch with pycuda
* Use one cuda stream for data transfer and kernels
* Remove triton submodule
* Fix memory leak by using weakrefs for caching
* Fix memory access by adding valid as mask for load
* Fix invalid kernel launches by flattening the grid (#515)
---------
Co-authored-by: Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>
2023-02-01 11:53:57 -08:00
George Hotz
4e24002bbe
no generic exceptions
2023-02-01 11:14:37 -08:00
Jacky Lee
54c68defc7
Replace SIGN with GT0 (#511)
* Replace sign with gt0
* Replace sign with gt0
* GT0 works on GPU
* Fix brackets
---------
Co-authored-by: Tom Finet <tom.codeninja@gmail.com>
2023-02-01 11:01:39 -08:00
Jacky Lee
799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers
* Remove unused os
* Fix default value
* Fix more defaults for CI
* Fix bracket
* Revert changes to openpilot/compile.py
* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
d91b6711ea
oops, broke BN
2023-01-31 08:18:48 -08:00
George Hotz
21f2af08d5
getenv + graphing
2023-01-30 19:15:03 -08:00
Jacky Lee
491e78d203
Add symbolic tests for correctness (#494)
* [WIP] Add symbolic tests for correctness
* Fix typo
* Fix expected value for test_and_fold
* Add more tests for symbolic
* It is indeed right
* Clean up
* Check all strings
* Put TODO back
2023-01-30 18:40:16 -08:00
George Hotz
60ccddb58b
reenable SWAP
2023-01-30 17:32:02 -08:00
George Hotz
c1a769b68b
fix bug in gpu copy out
2023-01-30 16:51:28 -08:00
George Hotz
e87410c531
fix multiple accumulators
2023-01-30 16:22:26 -08:00
George Hotz
aea55eb196
found failing upcast
2023-01-30 16:12:56 -08:00
George Hotz
b67f997864
tests pass w/o float4
2023-01-30 15:40:49 -08:00
George Hotz
c6f570a2e6
improve progress bar
2023-01-30 14:50:28 -08:00
Kevin Gilpin
4685c9c095
Big changes (#498)
Use make_pair
2023-01-30 14:42:22 -08:00
George Hotz
7118602c97
goat progress bar
2023-01-30 14:37:26 -08:00
George Hotz
7ee0d99c70
CLCACHE
2023-01-30 14:02:06 -08:00
George Hotz
7457f0d755
KOPT=2
2023-01-30 13:28:06 -08:00
George Hotz
cccfea4b25
factor out KOPT code
2023-01-30 13:13:55 -08:00
George Hotz
de2c419fd4
make_pair and first attempt at hlb_cifar10
2023-01-30 11:07:23 -08:00
AllentDan
7b6b1f32b1
[Fix] fix typo: test_mnist -> datasets (#492)
* test_mnist -> datasets
* fix mnist_gan
2023-01-29 21:30:47 -08:00