7979 Commits

Author SHA1 Message Date
George Hotz
34f43ea10e LAZY and CLCACHE are defaults 2022-07-04 13:09:15 -07:00
George Hotz
425b0dcd58 sorry linecount, CLCACHE 2022-07-04 12:52:04 -07:00
George Hotz
b7afd83267 track cl mem used 2022-07-04 12:19:00 -07:00
George Hotz
5ef62c33a1 SHUFFLE_MOVEMENT_OPS is OPT=3 2022-07-04 09:55:30 -07:00
George Hotz
d5de8452c6 dashed loadops 2022-07-04 09:50:56 -07:00
George Hotz
e74adcce5c refactoring 2022-07-04 09:25:19 -07:00
George Hotz
0bdb021880 separate realize functions for different ops 2022-07-04 09:07:22 -07:00
George Hotz
81b73f97a3 Optiimzation (#355)
* constant folding into kernels

* that opt worth it?

* fix mypy

* ast one kernel

* save 2 lines in conv kernel

* debug print kernel count

* cl debugging

* early realize inputs

* refactor Device
2022-07-04 08:58:57 -07:00
George Hotz
df7976248b be lazy with the gpubuffer copies for host for constant folding 2022-07-03 23:04:14 -07:00
George Hotz
4d4ea47ca7 one more line 2022-07-03 17:28:42 -07:00
George Hotz
02cd8510cb cleanups 2022-07-03 17:23:20 -07:00
George Hotz
d89542640a hmm, typechecker isn't checking everything 2022-07-03 17:12:51 -07:00
George Hotz
6b0aa2a902 sorry about the line count, this is a good optimization 2022-07-03 17:11:13 -07:00
George Hotz
748618530b tests will run at okay speed now? 2022-07-03 16:41:52 -07:00
George Hotz
c3d13893f9 add SHUFFLE_MOVEMENT_OPS, exactly 1000 lines 2022-07-03 16:30:42 -07:00
George Hotz
e6e43e820e should fix tests 2022-07-03 16:06:11 -07:00
George Hotz
71a812fbf2 elementwise_ops 2022-07-03 15:29:38 -07:00
George Hotz
d7aad46758 test lazy also, make TestMNIST faster 2022-07-03 15:19:19 -07:00
Nicklas Boman
64d986bc8b add mypy to ci testing (#353) 2022-07-03 15:11:35 -07:00
George Hotz
57ebce8d67 first LazyBuffer optimizations 2022-07-03 15:09:16 -07:00
George Hotz
a1a20891ef more types 2022-07-03 14:03:34 -07:00
George Hotz
99b287ed87 typechecks 2022-07-03 13:54:30 -07:00
George Hotz
cdf2be74f9 add neg 2022-07-03 13:04:58 -07:00
George Hotz
72a9ff7011 remove numpy usage 2022-07-03 12:58:51 -07:00
George Hotz
745e36fda5 mlops cleanup 2022-07-03 12:41:05 -07:00
George Hotz
93c378dffc add test for slice_one 2022-07-03 12:14:20 -07:00
George Hotz
d10dd175f4 fix len 0 shapes in getitem 2022-07-03 12:12:02 -07:00
George Hotz
1b1c82fac7 print underlying buffer if it's realized 2022-07-03 11:52:58 -07:00
George Hotz
df16b455a7 make lazy the default (#352)
* make lazy the default

* always float32

* while the lazy framework should be default, lazyness itself shouldn't be (for now)

* bugfixes

* remove the need for the ops class

* fxn_for_op

* hmm, my contiguous asserts went away

* move small shape thing

* refactor reduce

* remove the weird unused new functions

* only that install works

* thats broken

* unused imports, should be good if it passes
2022-07-03 11:40:27 -07:00
George Hotz
bbfdd28a6d flops counter was dumb 2022-07-03 07:41:52 -07:00
George Hotz
c7a580daa9 flip div order 2022-07-02 23:37:22 -07:00
George Hotz
0d82cfd587 huh, torch 1.12 broke it. remove unused requirements.txt and pin torch 1.11 2022-07-02 23:07:59 -07:00
George Hotz
e822aae9ec reorg opts, nicer graph 2022-07-02 22:29:09 -07:00
George Hotz
f9a8412b68 make contiguous ops yellow 2022-07-02 17:54:04 -07:00
George Hotz
207b9e1df3 padding is now a param to conv2d 2022-07-02 17:11:12 -07:00
George Hotz
cde137d163 simple shapetracker tests 2022-07-02 16:02:15 -07:00
George Hotz
368c0ce2f6 NUM=-2 for ants 2022-07-02 15:47:10 -07:00
George Hotz
7276f8d6bf improve constant folding, detach before moving tensor 2022-07-02 15:29:40 -07:00
George Hotz
0cb99d72e9 NUM=-1 is a small efficientnet for small people 2022-07-02 15:11:51 -07:00
George Hotz
8cf1aed0f4 don't track_running_stats, parameters must require_grad 2022-07-02 14:38:45 -07:00
George Hotz
07b438aa8b move that to resolve time 2022-07-02 14:26:13 -07:00
George Hotz
dbf4aa09db assert and tuple 2022-06-27 09:19:54 -07:00
George Hotz
37a6c0ef59 create with new ShapeTracker 2022-06-27 09:07:45 -07:00
George Hotz
e55a9833fb a little more readable 2022-06-27 08:54:04 -07:00
George Hotz
67ff6b52fd move padding to convs in enet 2022-06-26 23:14:31 -07:00
George Hotz
04f521a963 err, float32 2022-06-26 23:05:04 -07:00
George Hotz
8540d1f289 track membw 2022-06-26 23:03:53 -07:00
George Hotz
3a414d7f50 cleanup, add flops tracking 2022-06-26 22:43:39 -07:00
George Hotz
a699f7cb0b debug cleanups 2022-06-26 21:58:44 -07:00
George Hotz
15a16b98e6 remove get_root 2022-06-26 21:18:02 -07:00