George Hotz
d6f4219952
LayerNorm2d for 2 lines
2023-03-20 16:58:43 -07:00
George Hotz
128ca160ac
lazy: remove required device
2023-03-20 16:31:45 -07:00
George Hotz
120d7072bd
indexing merge almost works
2023-03-20 16:17:07 -07:00
George Hotz
06abbbfe7c
remove the stupid register class (#721)
* remove the stupid register class
* touchups
* colorful display name
2023-03-20 15:45:12 -07:00
George Hotz
30b795874a
remove RMSprop, nobody uses it anymore
2023-03-20 12:31:34 -07:00
George Hotz
25287a974e
types (#720)
* types
* cleanups
* don't use None, use LocalBuffer
* eh
2023-03-20 12:31:02 -07:00
George Hotz
9b314c6342
factor uops transformers into functions
2023-03-20 08:19:48 -07:00
George Hotz
623fb1ef28
do test_conv_with_bn test
2023-03-19 23:53:56 -07:00
George Hotz
5495c7d64e
linearizer! (#714)
* linearizer outputs something
* working ish
* cstyle codegen
* clang mostly works
* fix load valid
* fix numberless loop
* fancy gen
* working
* fix enet compiler
* cleanups
* float4 upcasting
* less lines
* supports_float4
* constant folding
* mulacc
* internet tests flaky in CI
* 90% image support
* fix image generic
* bugs exposed with shapetracker and single view
* new llvm
* use vload, remove OLD
* that's really poorly done
* ending up being more lines
2023-03-19 23:43:49 -07:00
Cyril Roumégous
b629fd4cd8
add AdamW optimizer (#716)
* add AdamW optimizer
* one liner Adam optimizer
2023-03-19 12:51:06 -07:00
George Hotz
1012b68f7e
finally, some speedups
2023-03-18 18:17:33 -07:00
George Hotz
902906f909
Fix constant folding (#713)
* fix
* codegen
* contiguous is real
* no bufs_to_delete
* don't assign rawconst
* remove neg and not
* need exec to fix custom function jit
2023-03-18 17:52:46 -07:00
Fernando Vidal
73bd0b217b
add int64 as supported dtype from numpy (#699)
* add int64 as supported dtype from numpy
Without this, examples/transformer.py didn't run. With this change it runs successfully.
* Update helpers.py
* Update transformer.py
* Update training.py
2023-03-18 17:15:04 -07:00
George Hotz
f355b02987
remove comments and reorder
2023-03-18 14:48:39 -07:00
George Hotz
f5467cfedc
Devicebufferless (#708)
* runs one metal kernel
* conv2d works
* ops tests are passing
* const folding
* all ops work
* pre commit always passes
* torch works
* working still
* fix graph test
* tests passing
* image almost works
* image conv works
* most images
* fix custom
* fix assignment
* fix compile enet
* clean up comments
* fix realize return value
* include shapetracker in LB repr
* copy should make a copy
* reenable method cache
* fix lna
* dtypes in graph
* forward only for IMAGE=2
* simple realize
* getting close
* fixup new api, it's good except the kernel count
* back to 197 kernels
* tests should pass
* go to a real float
* no type_on_cpu
* fix the docs
* put shapetracker back in its proper place
2023-03-18 14:40:23 -07:00
Kirill
26a3888ab8
Fix llama 13B RAM usage (#710)
2023-03-18 13:50:09 -07:00
Kirill
0fe5014b1f
Use pathlib (#711)
* Use pathlib in llama
* Use pathlib in stablediffusion
2023-03-18 13:49:21 -07:00
Connor Henderson
5e8fdfa956
Update path for test_mnist in README (#706)
2023-03-15 18:42:17 -07:00
George Hotz
3a8af99adb
i understand ClassVar now
2023-03-15 09:00:25 -07:00
Kirill
0532025b04
Fix llama 13B weights loading (#700)
* Fix llama 13B weights loading
* refactor more
* add test
* test storage offset
* fix spacing
* fix strides
* llama 13B working?
* yolo?
* better test for seeks
2023-03-15 08:59:52 -07:00
Pasan Perera
df48753692
fixed the import error for latest changes in master (#705)
2023-03-15 08:59:42 -07:00
Ayushman Kumar
e28bd11ff1
Cast Tensor data to float32 (#703)
* Cast Tensor data to float32
* astype('float32') --> Tensor.randn()
2023-03-14 23:09:41 -07:00
Jacky Lee
5e820818e9
Cast image to float32 (#702)
2023-03-14 08:13:19 -07:00
George Hotz
54f499b623
Move rawbuffer (#697)
* move GlobalCounters to helpers
* that's not part of the public api
* move InterpretedBuffer
* remove fromCPU from devicebuffer
2023-03-13 22:30:36 -07:00
George Hotz
cbc5a7222a
symbolic is now a 6/10 due to the infinite loop. do better.
2023-03-13 00:07:59 -07:00
George Hotz
aca244194f
bufs not none
2023-03-12 23:57:41 -07:00
George Hotz
c594a0a835
fix flip bug, add new unit tests
2023-03-12 23:55:31 -07:00
George Hotz
a4abcf0969
improve test_example
2023-03-12 22:59:40 -07:00
George Hotz
5577634cf3
tests in pre commit
2023-03-12 22:42:26 -07:00
George Hotz
ce1564b05e
fix shapetracker test
2023-03-12 22:33:25 -07:00
George Hotz
153cce0f7e
tutorial
2023-03-12 22:31:46 -07:00
George Hotz
8d16ebaea7
we have docs:
2023-03-12 19:05:44 -07:00
George Hotz
b512edc9ff
no decorators for image methods. move out RawMallocBuffer. -7 lines
2023-03-12 16:28:45 -07:00
George Hotz
ed9ab6ff03
move image to nn/image.py
2023-03-12 16:21:42 -07:00
George Hotz
fe0e8a306f
jittable llama
2023-03-12 14:15:04 -07:00
George Hotz
dcac618515
stop wasting time with the compiler. tinygrad needs to just jit
2023-03-12 12:08:46 -07:00
George Hotz
46b49d50bd
llvm was using wrong shapetracker
2023-03-12 11:49:03 -07:00
George Hotz
fdde87afda
Revert "Revert "late simplify on st""
This reverts commit c8508e359d.
2023-03-12 11:47:44 -07:00
George Hotz
c8508e359d
Revert "late simplify on st"
This reverts commit 606550474c.
2023-03-12 11:46:10 -07:00
George Hotz
606550474c
late simplify on st
2023-03-12 11:38:56 -07:00
George Hotz
de6f1695a3
only allow exact buffer name
2023-03-12 11:13:36 -07:00
George Hotz
15e0b56e39
compile works (#688)
* compile works
* runtimes
* line count
* fix custom, to tg dtype
* meh, that's fine with lazy import
2023-03-12 11:01:25 -07:00
Kirill
af7745073f
Add comments to SD (#686)
* Add explanation for empty lambdas
* Fix my_unpickle if pytorch_lightning is installed
* oops
2023-03-12 10:56:49 -07:00
George Hotz
58d3824cbe
better get_state_dict
2023-03-12 00:10:48 -08:00
George Hotz
046b3952c3
get_state_dict
2023-03-11 23:46:53 -08:00
George Hotz
6c3675c01c
_mmap loads to gpu fast
2023-03-11 23:00:13 -08:00
George Hotz
dc9a6b4bb7
fix float16 in CLANG on linux
2023-03-11 21:51:22 -08:00
George Hotz
803b0aef28
track memory for numpy/torch
2023-03-11 20:39:10 -08:00
George Hotz
37cf6fc4c0
err, external_test_opt.py broke...fusing will have to wait. correctness over speed
2023-03-11 17:54:47 -08:00
George Hotz
305b9f2d21
multistep optim tests passing
2023-03-11 17:49:53 -08:00