George Hotz
b1dec64815
new types and fixup ShapeTracker type mismatches
2023-01-25 19:39:36 -08:00
George Hotz
faab6461dd
that lambda is required
2023-01-25 18:46:56 -08:00
George Hotz
8db345d846
functools.partialmethod -> lambda fixes Python 3.11
2023-01-25 18:08:38 -08:00
George Hotz
90121482fa
oops, don't assign self
2023-01-09 18:02:12 -08:00
George Hotz
fad7cba590
move batchnorm to Tensor
2023-01-09 18:00:16 -08:00
Daniel Davis
64ff1ddc10
Reduce line count (#424)
* save a line, save a life
* save a line, save a life
* change order of tern
2022-11-09 10:07:22 -08:00
George Hotz
2cc1d970c6
updates from the chonker branch
2022-11-07 21:12:08 -08:00
George Hotz
df31dde174
hasattr and DeviceBuffer type fixups
2022-10-28 09:05:45 -07:00
George Hotz
10921a60c4
more imports from llvm branch
2022-10-26 18:02:36 -07:00
George Hotz
6a8fb53304
move ops.py into lazy.py (#402)
* move ops.py into lazy.py
* fix graph and linter
* ugh, didn't add
2022-10-25 13:58:03 -07:00
Drew Hintz
165fb4d631
remove redundant list comprehension from inside all. (#397)
remove explicit inherit from object.
2022-10-13 09:58:35 -07:00
George Hotz
dec5334da9
revert layernorm to have axis param
2022-09-26 10:11:38 -04:00
George Hotz
dc80bf6f85
layernorm is all axis but the first
2022-09-25 17:55:48 -04:00
George Hotz
60df954377
Fix weight init: this work? (#391)
* this work?
* glorot uniform
* requies_grad broke
* propagate the None correctly
* so this weight init works
* ahh, i think it's this
* can't beat this
* glorot is best for ae
* remove comments
2022-09-25 16:46:33 -04:00
George Hotz
271446e3eb
set requires_grad to None (#387)
* set requires_grad to None
* some things need gradients
* hmm, why was get_parameters filtering
2022-09-21 11:16:02 -04:00
George Hotz
a8aa1f9589
that's simpler
2022-09-18 20:40:46 -04:00
George Hotz
57e804a9bf
add min support
2022-09-18 20:39:41 -04:00
George Hotz
790af99a48
fix slice one multi, and linear can be simpler with new broadcasting
2022-09-06 19:51:33 -07:00
George Hotz
4f4ecbec97
add div to operators
2022-09-06 17:39:26 -07:00
YassineYousfi
5aad460c7a
broadcast from right to left (#375)
* broadcast from right to left
* add another broadcasted add test
2022-09-06 16:36:13 -07:00
George Hotz
f215534a64
1100 lines, but sane linter rules
2022-09-06 13:47:45 -07:00
George Hotz
2ed3bb6223
clip model is running
2022-09-05 11:26:32 -07:00
Ollin Boer Bohan
2c6f4e4c66
Make creation helpers use fp32 by default (#374)
* Make creation helpers use fp32 by default
half the big = twice the fast
* Fix flake8 with an extra multiply
2022-09-04 13:47:27 -07:00
George Hotz
9590d92750
stable diffusion compiles (add no_init)
2022-09-04 11:40:50 -07:00
George Hotz
172683c314
work
2022-09-04 11:21:09 -07:00
George Hotz
bcb867cdd6
better idea for numbers, do the division in python
2022-09-03 16:23:39 -07:00
George Hotz
39e1d23c88
from_number_like to fix div issue
2022-09-03 16:19:16 -07:00
George Hotz
c2a030fe55
one liner that's more clear
2022-09-03 16:08:48 -07:00
George Hotz
852de7c66c
remove ugly parens
2022-09-03 15:41:37 -07:00
George Hotz
4dadd95e3c
fix tests hopefully, more stable diffusion
2022-09-03 10:38:31 -07:00
Comma Device
a734df98fa
TEST_ENET for openpilot compiler
2022-08-31 13:23:36 -04:00
George Hotz
e194ae0c1d
typos
2022-08-30 19:52:21 -07:00
Mitchell Goff
3af650b028
Rewrote Tensor.cat to be shorter and (hopefully) clearer (#372)
* Rewrote Tensor.cat to be shorter and (hopefully) clearer
* Use cumsum[-1] instead of separate sum
2022-08-30 16:15:07 -07:00
George Hotz
5efab7cf1d
add reciprocal
2022-08-29 18:00:24 -07:00
kposborne2
ec5d9b355c
use functools.partialmethod (#369)
Co-authored-by: Kyle <kposborne@gmail.com>
2022-08-21 12:13:31 -07:00
George Hotz
7f15779942
t.assign in optim
2022-08-20 14:04:33 -07:00
George Hotz
b132de677d
tinygrad.nn (#367)
* tinygrad.nn
* flake8
* working on pylint
* more pylint
* more pylint
* pylint passes
* networkx
* mypy can't infer that type
* junk
2022-08-18 07:41:00 -07:00
George Hotz
46e7dfade1
REQUIRES_SIMPLE_REDUCE
2022-07-19 11:42:14 -07:00
George Hotz
5d45c6e516
Fold reduce (#362)
* folding reduce
* fold through movementops
* fixup shapes
* was too aggressive
* i knew we needed that
* don't recompute reduce
* working
* fix openpilot compile
* prunegraph openpilot
* types and reduce_shape
* refactor
* cleanups
* neater
* 1009
* 1004
* clean up reduce for 998
2022-07-19 09:24:02 -07:00
George Hotz
bcf422dfdd
Device2 (#358)
* option for matmul
* fixups
* fast like a nascar
* running
* thneed runner
* no buffer id makes no backing buffer
* move constant folding to the top
* runs on mac
* folded biases
* was v slow
* maybe just that
* elu touchup
* speed and float32
Co-authored-by: Comma Device <device@comma.ai>
2022-07-16 07:26:19 -07:00
George Hotz
817b64f5e5
A conv is a reduce op (#356)
* universal strided conv
* more correct
* hmm, CPU works
* cleaner cl code output
* make noconv a flag
* cleanup __getitem__
* refactor broadcasting
* put that back
* unneeded reshape in getitem
* fix strided for torch
2022-07-10 19:58:50 -07:00
George Hotz
ca9532ce29
less lines, and typing found a bug
2022-07-08 08:57:12 -07:00
George Hotz
2035b89e54
wooo 1k lines
2022-07-08 08:44:57 -07:00
George Hotz
2720ef49ca
extra and test and tuple
2022-07-07 10:01:33 -07:00
George Hotz
d5de8452c6
dashed loadops
2022-07-04 09:50:56 -07:00
George Hotz
c3d13893f9
add SHUFFLE_MOVEMENT_OPS, exactly 1000 lines
2022-07-03 16:30:42 -07:00
George Hotz
57ebce8d67
first LazyBuffer optimizations
2022-07-03 15:09:16 -07:00
George Hotz
a1a20891ef
more types
2022-07-03 14:03:34 -07:00
George Hotz
99b287ed87
typechecks
2022-07-03 13:54:30 -07:00
George Hotz
cdf2be74f9
add neg
2022-07-03 13:04:58 -07:00