George Hotz
a5a55ac19e
GlobalCounters cache + assign in optim
2023-02-08 17:10:55 -06:00
George Hotz
3d63934995
refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime
* fix thneed, rename cl to _cl
* bugfix + _cuda
* fix tests
* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
2844482a60
Mypy fun (#541)
* mypy fun
* things are just faster
* running fast
* mypy is fast
* compile.sh
* no gpu hack
* refactor ops_cpu and ops_torch to not subclass
* make weak buffer work
* tensor works
* fix test failing
* cpu/torch cleanups
* no or operator on dict in python 3.8
* that was junk
* fix warnings
* comment and touchup
2023-02-08 09:56:51 -06:00
James Roberts
db0a9b0a2d
Refactor CL.time_sum into GlobalCounters (#519)
2023-02-01 20:13:56 -08:00
Jacky Lee
799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers
* Remove unused os
* Fix default value
* Fix more defaults for CI
* Fix bracket
* Revert changes to openpilot/compile.py
* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
2db272c7f7
Kernel Optimizer (#489)
* kernel optimizer
* 10x faster, but wrong. not good deal
* move test -> extra
* print x speedup
* clcache
* fix clcache + DEBUG
* GFLOPS estimate
* i==3
2023-01-29 17:15:00 -08:00
George Hotz
66da3bc3c0
reset the benchmark timer
2023-01-25 09:20:34 -08:00
George Hotz
a0d169eb59
fix efficientnet
2022-09-28 14:23:01 -07:00
George Hotz
b132de677d
tinygrad.nn (#367)
* tinygrad.nn
* flake8
* working on pylint
* more pylint
* more pylint
* pylint passes
* networkx
* mypy can't infer that type
* junk
2022-08-18 07:41:00 -07:00
George Hotz
acbeaf0ba9
adam in benchmark_train_efficientnet
2022-07-19 09:33:07 -07:00
George Hotz
d985217fa4
skip reduce noops
2022-07-16 07:47:43 -07:00
George Hotz
5e46561f7e
no_grad = NOT backward
2022-07-10 20:54:57 -07:00
George Hotz
d5d9cffe7c
training param for batchnorm
2022-07-04 13:28:03 -07:00
George Hotz
34f43ea10e
LAZY and CLCACHE are defaults
2022-07-04 13:09:15 -07:00
George Hotz
b7afd83267
track cl mem used
2022-07-04 12:19:00 -07:00
George Hotz
d5de8452c6
dashed loadops
2022-07-04 09:50:56 -07:00
George Hotz
7276f8d6bf
improve constant folding, detach before moving tensor
2022-07-02 15:29:40 -07:00
George Hotz
0cb99d72e9
NUM=-1 is a small efficientnet for small people
2022-07-02 15:11:51 -07:00
George Hotz
8cf1aed0f4
don't track_running_stats, parameters must require_grad
2022-07-02 14:38:45 -07:00
George Hotz
f607f18006
fix backward
2022-06-25 00:00:53 -07:00
George Hotz
ec30f0402f
improve benchmark_train_efficientnet
2022-06-24 23:46:38 -07:00
George Hotz
d748353ce5
err, okay, a bit more off
2022-06-24 22:44:57 -07:00
George Hotz
bdde95f16e
CACHE_LAZYBUFFERS options + benchmark. only a couple x from torch
2022-06-24 22:33:53 -07:00