George Hotz
d651caa864
fixup openpilot/compile.py
2022-07-11 13:59:09 -07:00
George Hotz
5e46561f7e
no_grad = NOT backward
2022-07-10 20:54:57 -07:00
George Hotz
b34ae7876f
lol chr(10) not chr(13)
2022-07-10 20:03:11 -07:00
George Hotz
817b64f5e5
A conv is a reduce op (#356)
* universal strided conv
* more correct
* hmm, CPU works
* cleaner cl code output
* make noconv a flag
* cleanup __getitem__
* refactor broadcasting
* put that back
* unneeded reshape in getitem
* fix strided for torch
2022-07-10 19:58:50 -07:00
George Hotz
057e4f5aa5
a little faster and cleaner
2022-07-09 08:14:01 -07:00
George Hotz
c77ba7fa3f
ops_cpu readability
2022-07-09 07:48:54 -07:00
George Hotz
0a36475700
no einsum for now
2022-07-09 00:04:40 -07:00
George Hotz
c39a245696
that's not where i thought we'd lose lines...
2022-07-08 23:52:38 -07:00
George Hotz
75e1848b09
always SHUFFLE_RESHAPE_OPS
2022-07-08 23:19:39 -07:00
George Hotz
44848ee5dc
prints show we can precompute from the outside
2022-07-08 10:59:20 -07:00
George Hotz
68959be05d
precompute weights for opencl
2022-07-08 10:56:48 -07:00
George Hotz
d8e7f1f8bc
opencl type ignore
2022-07-08 10:33:55 -07:00
George Hotz
ae335b6d3e
opencl works, but tons of kernels
2022-07-08 10:22:04 -07:00
George Hotz
5b66d1bb0b
begin fixing up opencl
2022-07-08 10:20:14 -07:00
George Hotz
7e17f2ae8d
fix mypy, add TODOs
2022-07-08 09:57:22 -07:00
George Hotz
8557ed88df
use ast engine for merged reduceop
2022-07-08 09:37:40 -07:00
George Hotz
3656a5615a
MERGE_ELEMENTWISE_INTO_REDUCE
2022-07-08 09:32:28 -07:00
George Hotz
ca9532ce29
less lines, and typing found a bug
2022-07-08 08:57:12 -07:00
George Hotz
2035b89e54
wooo 1k lines
2022-07-08 08:44:57 -07:00
George Hotz
2a8c1071d9
cleanups
2022-07-08 08:36:31 -07:00
George Hotz
e6733286df
unify conv and reduce
2022-07-08 08:27:30 -07:00
George Hotz
9c34b3eef3
tighten up gpu kernels
2022-07-08 07:59:04 -07:00
George Hotz
563bf2d8e8
force input/weight to be contiguous (uncached)
2022-07-08 07:40:30 -07:00
George Hotz
1cf805a56a
fix no MERGE_MOVEMENT_OPS bug
2022-07-08 07:27:53 -07:00
George Hotz
c0ef998b48
remove finished todo
2022-07-07 11:39:25 -07:00
George Hotz
715e335c60
fix types
2022-07-07 11:36:09 -07:00
George Hotz
9ee8426c51
much better cache
2022-07-07 11:32:00 -07:00
George Hotz
eb6696c3a5
only childless elementwise ops get merged
2022-07-07 11:13:25 -07:00
George Hotz
04e7e4104c
track graph children and make lazycache use weak references
2022-07-07 11:01:18 -07:00
George Hotz
001cfe83a2
local
2022-07-07 10:05:26 -07:00
George Hotz
2720ef49ca
extra and test and tuple
2022-07-07 10:01:33 -07:00
George Hotz
059fe94700
junk import
2022-07-06 21:47:38 -07:00
George Hotz
a61a4d09ad
merge conv and binary op
2022-07-06 08:27:26 -07:00
George Hotz
6e0015095f
LBCACHE
2022-07-04 16:05:19 -07:00
George Hotz
7a5acd3ace
cache
2022-07-04 16:04:48 -07:00
George Hotz
d5d9cffe7c
training param for batchnorm
2022-07-04 13:28:03 -07:00
George Hotz
21c78b9316
can be v slow
2022-07-04 13:23:34 -07:00
George Hotz
46bce4156f
CL profiling
2022-07-04 13:22:12 -07:00
George Hotz
34f43ea10e
LAZY and CLCACHE are defaults
2022-07-04 13:09:15 -07:00
George Hotz
425b0dcd58
sorry linecount, CLCACHE
2022-07-04 12:52:04 -07:00
George Hotz
b7afd83267
track cl mem used
2022-07-04 12:19:00 -07:00
George Hotz
5ef62c33a1
SHUFFLE_MOVEMENT_OPS is OPT=3
2022-07-04 09:55:30 -07:00
George Hotz
d5de8452c6
dashed loadops
2022-07-04 09:50:56 -07:00
George Hotz
e74adcce5c
refactoring
2022-07-04 09:25:19 -07:00
George Hotz
0bdb021880
separate realize functions for different ops
2022-07-04 09:07:22 -07:00
George Hotz
81b73f97a3
Optimization (#355)
* constant folding into kernels
* that opt worth it?
* fix mypy
* ast one kernel
* save 2 lines in conv kernel
* debug print kernel count
* cl debugging
* early realize inputs
* refactor Device
2022-07-04 08:58:57 -07:00
George Hotz
df7976248b
be lazy with the gpubuffer copies for host for constant folding
2022-07-03 23:04:14 -07:00
George Hotz
4d4ea47ca7
one more line
2022-07-03 17:28:42 -07:00
George Hotz
02cd8510cb
cleanups
2022-07-03 17:23:20 -07:00
George Hotz
d89542640a
hmm, typechecker isn't checking everything
2022-07-03 17:12:51 -07:00