| George Hotz | 68959be05d | precompute weights for opencl | 2022-07-08 10:56:48 -07:00 |
| George Hotz | d8e7f1f8bc | opencl type ignore | 2022-07-08 10:33:55 -07:00 |
| George Hotz | ae335b6d3e | opencl works, but tons of kernels | 2022-07-08 10:22:04 -07:00 |
| George Hotz | 5b66d1bb0b | begin fixing up opencl | 2022-07-08 10:20:14 -07:00 |
| George Hotz | 7e17f2ae8d | fix mypy, add TODOs | 2022-07-08 09:57:22 -07:00 |
| George Hotz | 8557ed88df | use ast engine for merged reduceop | 2022-07-08 09:37:40 -07:00 |
| George Hotz | 3656a5615a | MERGE_ELEMENTWISE_INTO_REDUCE | 2022-07-08 09:32:28 -07:00 |
| George Hotz | ca9532ce29 | less lines, and typing found a bug | 2022-07-08 08:57:12 -07:00 |
| George Hotz | 2035b89e54 | wooo 1k lines | 2022-07-08 08:44:57 -07:00 |
| George Hotz | 2a8c1071d9 | cleanups | 2022-07-08 08:36:31 -07:00 |
| George Hotz | e6733286df | unify conv and reduce | 2022-07-08 08:27:30 -07:00 |
| George Hotz | 9c34b3eef3 | tighten up gpu kernels | 2022-07-08 07:59:04 -07:00 |
| George Hotz | 563bf2d8e8 | force input/weight to be contiguous (uncached) | 2022-07-08 07:40:30 -07:00 |
| George Hotz | 1cf805a56a | fix no MERGE_MOVEMENT_OPS bug | 2022-07-08 07:27:53 -07:00 |
| George Hotz | c0ef998b48 | remove finished todo | 2022-07-07 11:39:25 -07:00 |
| George Hotz | 715e335c60 | fix types | 2022-07-07 11:36:09 -07:00 |
| George Hotz | 9ee8426c51 | much better cache | 2022-07-07 11:32:00 -07:00 |
| George Hotz | eb6696c3a5 | only childless elementwise ops get merged | 2022-07-07 11:13:25 -07:00 |
| George Hotz | 04e7e4104c | track graph children and make lazycache use weak references | 2022-07-07 11:01:18 -07:00 |
| George Hotz | 001cfe83a2 | local | 2022-07-07 10:05:26 -07:00 |
| George Hotz | 2720ef49ca | extra and test and tuple | 2022-07-07 10:01:33 -07:00 |
| George Hotz | 059fe94700 | junk import | 2022-07-06 21:47:38 -07:00 |
| George Hotz | a61a4d09ad | merge conv and binary op | 2022-07-06 08:27:26 -07:00 |
| George Hotz | 6e0015095f | LBCACHE | 2022-07-04 16:05:19 -07:00 |
| George Hotz | 7a5acd3ace | cache | 2022-07-04 16:04:48 -07:00 |
| George Hotz | d5d9cffe7c | training param for batchnorm | 2022-07-04 13:28:03 -07:00 |
| George Hotz | 21c78b9316 | can be v slow | 2022-07-04 13:23:34 -07:00 |
| George Hotz | 46bce4156f | CL profiling | 2022-07-04 13:22:12 -07:00 |
| George Hotz | 34f43ea10e | LAZY and CLCACHE are defaults | 2022-07-04 13:09:15 -07:00 |
| George Hotz | 425b0dcd58 | sorry linecount, CLCACHE | 2022-07-04 12:52:04 -07:00 |
| George Hotz | b7afd83267 | track cl mem used | 2022-07-04 12:19:00 -07:00 |
| George Hotz | 5ef62c33a1 | SHUFFLE_MOVEMENT_OPS is OPT=3 | 2022-07-04 09:55:30 -07:00 |
| George Hotz | d5de8452c6 | dashed loadops | 2022-07-04 09:50:56 -07:00 |
| George Hotz | e74adcce5c | refactoring | 2022-07-04 09:25:19 -07:00 |
| George Hotz | 0bdb021880 | separate realize functions for different ops | 2022-07-04 09:07:22 -07:00 |
| George Hotz | 81b73f97a3 | Optiimzation (#355) * constant folding into kernels * that opt worth it? * fix mypy * ast one kernel * save 2 lines in conv kernel * debug print kernel count * cl debugging * early realize inputs * refactor Device | 2022-07-04 08:58:57 -07:00 |
| George Hotz | df7976248b | be lazy with the gpubuffer copies for host for constant folding | 2022-07-03 23:04:14 -07:00 |
| George Hotz | 4d4ea47ca7 | one more line | 2022-07-03 17:28:42 -07:00 |
| George Hotz | 02cd8510cb | cleanups | 2022-07-03 17:23:20 -07:00 |
| George Hotz | d89542640a | hmm, typechecker isn't checking everything | 2022-07-03 17:12:51 -07:00 |
| George Hotz | 6b0aa2a902 | sorry about the line count, this is a good optimization | 2022-07-03 17:11:13 -07:00 |
| George Hotz | 748618530b | tests will run at okay speed now? | 2022-07-03 16:41:52 -07:00 |
| George Hotz | c3d13893f9 | add SHUFFLE_MOVEMENT_OPS, exactly 1000 lines | 2022-07-03 16:30:42 -07:00 |
| George Hotz | e6e43e820e | should fix tests | 2022-07-03 16:06:11 -07:00 |
| George Hotz | 71a812fbf2 | elementwise_ops | 2022-07-03 15:29:38 -07:00 |
| George Hotz | d7aad46758 | test lazy also, make TestMNIST faster | 2022-07-03 15:19:19 -07:00 |
| Nicklas Boman | 64d986bc8b | add mypy to ci testing (#353) | 2022-07-03 15:11:35 -07:00 |
| George Hotz | 57ebce8d67 | first LazyBuffer optimizations | 2022-07-03 15:09:16 -07:00 |
| George Hotz | a1a20891ef | more types | 2022-07-03 14:03:34 -07:00 |
| George Hotz | 99b287ed87 | typechecks | 2022-07-03 13:54:30 -07:00 |