Commit Graph

7979 Commits

Author SHA1 Message Date
George Hotz
0efe1e435f no need to render to check valid 2023-02-10 15:35:12 -06:00
Kirill
27154db99a Downloads weights in examples/stable_diffusion.py (#537)
* Downloads weights in examples/stable_diffusion.py

* use download_file_if_not_exists in fetch

* make consistent with previous NOCACHE behavior
2023-02-10 14:37:04 -06:00
George Hotz
4459cde68b minor matmul location cleanup 2023-02-10 14:12:42 -06:00
George Hotz
a007145ac4 oops, should be __itruediv__. we should add test for this 2023-02-10 14:04:40 -06:00
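
Note on the commit above: in Python 3 the `/=` operator dispatches to `__itruediv__`, so an in-place division hook defined under any other name is never called. The log does not say what the method was named before the fix. The sketch below uses a hypothetical `Scalar` class (not a tinygrad type) to show the hook and the kind of test the message asks for.

```python
class Scalar:
    """Hypothetical wrapper used only for illustration; not a tinygrad class."""

    def __init__(self, value):
        self.value = float(value)

    def __truediv__(self, other):
        # Out-of-place division: returns a new object.
        return Scalar(self.value / float(other))

    def __itruediv__(self, other):
        # In-place division: Python 3 calls this for `x /= other`.
        self.value /= float(other)
        return self  # in-place operators must return the result


x = Scalar(6)
alias = x
x /= 3  # mutates the existing object instead of rebinding to a new one
print(x.value, alias is x)
```

Checking that `alias is x` still holds after `/=` is one simple way to test that the in-place path, rather than the `__truediv__` fallback, was taken.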
Jacky Lee
5c51ae8dbf Show where tinygrad is faster in speed test vs torch (#549)
* show where tinygrad is faster

* don't change text color
2023-02-10 14:01:07 -06:00
George Hotz
87a7717222 LLVM backend uses shapetracker 2023-02-10 13:53:33 -06:00
George Hotz
c3cf17c6d0 Symbolic render (#550)
* render symbolic

* valid

* fix shapetracker tests

* render_python is the default

* expr is gone

* remove legacy behavior
2023-02-10 13:22:26 -06:00
George Hotz
5ed3622965 add dump to kernel_search 2023-02-10 12:13:30 -06:00
Jacky Lee
f08187526f Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
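
Note on the "call mkdir only if directory doesn't exist" item above: one common way to get that behavior in Python is `os.makedirs(..., exist_ok=True)`. Whether the commit uses this call or an explicit existence check is not stated in the log, so the helper name `ensure_dir` and the paths below are assumptions made for illustration.

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path` (and any missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "weights", "cache")
    ensure_dir(target)
    ensure_dir(target)  # second call is a no-op rather than an error
    created = os.path.isdir(target)
```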
Lucas Keller
56a06280c5 Testing/utils (#548)
* New unittest for utils.py

Unit test fetch in basic ways. Would have tested more fetches, but
downloading stuff for tests is annoying and mocking is more
dependencies.

* Remove unused imports
2023-02-10 12:08:20 -06:00
George Hotz
1257b0433a should fix tests 2023-02-09 13:12:14 -06:00
George Hotz
e6f19d4ce2 assume all generic exec ast have ProcessingOp 2023-02-09 13:03:48 -06:00
George Hotz
78795e3507 reduce line count by simplifying DeviceBuffer 2023-02-09 12:52:14 -06:00
George Hotz
5de850f6d5 assign buffer reuse (#547)
* assign buffer reuse works

* fix assign for torch and cpu

* allow assign from numpy

* fix llvm output_buffer

* add some assign tests

* fix assignment test

* test should fail without lazy

* env var to disable assign
2023-02-09 11:53:02 -06:00
George Hotz
473bbd3e35 fix graphs 2023-02-09 09:40:46 -06:00
George Hotz
16a7edc775 move base_fxn_for_op to ops_cpu 2023-02-08 18:23:48 -06:00
George Hotz
c642f5e72b less lines for torch 2023-02-08 18:15:59 -06:00
George Hotz
58a03eb693 generic processing op 2023-02-08 18:09:17 -06:00
George Hotz
4c2faa4140 functools.partial keeps mypy compiler working 2023-02-08 18:04:32 -06:00
George Hotz
cfd13c083b refactor GenericShape for a big line reduction 2023-02-08 18:01:08 -06:00
George Hotz
c656513591 GPURunner class will replace CL cache eventually 2023-02-08 17:31:36 -06:00
George Hotz
a5a55ac19e GlobalCounters cache + assign in optim 2023-02-08 17:10:55 -06:00
George Hotz
d9555bc478 that turned out to be dumb 2023-02-08 16:52:29 -06:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
8c8a5a77dd refactor llvm into runtime and ops 2023-02-08 16:28:32 -06:00
George Hotz
45ce4de6f3 improve typing 2023-02-08 12:48:21 -06:00
George Hotz
2e1bdc889a write out all the functions, no auto binding (#543)
* write out all the functions, no auto binding

* cleanups, more types

* Slice is for internal calls only

* improve typing

* ugh, put slice back
2023-02-08 12:41:39 -06:00
George Hotz
d854337f0d nn/optim.py compiles now 2023-02-08 11:25:18 -06:00
George Hotz
1029deccb1 refactor ops_cpu and ops_torch to not share code 2023-02-08 11:11:42 -06:00
George Hotz
ee18420c13 dyn add of math ops 2023-02-08 10:04:30 -06:00
George Hotz
2844482a60 Mypy fun (#541)
* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup
2023-02-08 09:56:51 -06:00
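
Note on the "no or operator on dict in python 3.8" item above: the dict union operator `|` was added in Python 3.9, so code that must run on 3.8 needs another way to merge dictionaries. A minimal sketch of one portable alternative follows. The function name and the sample keys are hypothetical and are not taken from the commit.

```python
def merge_dicts(base, overrides):
    """Return a new dict; keys in `overrides` win. Works on Python 3.5+."""
    return {**base, **overrides}


defaults = {"device": "CPU", "lazy": True}
merged = merge_dicts(defaults, {"device": "GPU"})
print(merged)
```

Unpacking builds a new dictionary and leaves both inputs unchanged, which matches what `base | overrides` does on newer interpreters.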
George Hotz
996e0a10b7 update cpu and torch to hold buffers (#542)
* update cpu and torch to hold buffers

* save lines, and probably faster
2023-02-08 09:40:45 -06:00
Mitchell Goff
ae4f0aeb5f NumPy-like semantics for Tensor.__getitem__ (#506)
* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis
2023-02-08 08:59:46 -06:00
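
Note on the commit above: the message lists negative indices, `np.newaxis`/`None`, and Python-style out-of-bounds slicing. The sketch below computes only the result shape under those rules, in pure Python. It is an illustration of the semantics and not the tinygrad implementation, and `indexed_shape` is a hypothetical helper.

```python
def indexed_shape(shape, index):
    """Shape produced by indexing an array of `shape` with ints, slices, or None.

    Hypothetical helper for illustration; not tinygrad code.
    """
    if not isinstance(index, tuple):
        index = (index,)
    out, dim = [], 0
    for item in index:
        if item is None:
            out.append(1)  # None / np.newaxis inserts a length-1 axis
            continue
        size = shape[dim]
        if isinstance(item, int):
            if not -size <= item < size:
                raise IndexError(f"index {item} out of range for size {size}")
            # a valid integer index (negative allowed) removes the axis
        elif isinstance(item, slice):
            # slice.indices clamps out-of-bounds limits like Python lists do
            start, stop, step = item.indices(size)
            out.append(len(range(start, stop, step)))
        else:
            raise TypeError(f"unsupported index: {item!r}")
        dim += 1
    out.extend(shape[dim:])  # axes not mentioned in the index are kept
    return tuple(out)


print(indexed_shape((4, 5), (slice(-3, 100), None)))
print(indexed_shape((4, 5), -1))
```

The first call mirrors `a[-3:100, None]` on a 4x5 array: the slice is clamped to three rows, and `None` adds a new axis before the untouched last axis.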
George Hotz
0ac3286af0 factor out Device 2023-02-07 16:08:20 -06:00
George Hotz
2aeebd70a6 mypy will compile the shapetracker, no speed up 2023-02-07 15:43:44 -06:00
George Hotz
185d2e3678 fix map_buffer and add some __slots__ 2023-02-07 15:32:48 -06:00
George Hotz
aebe75d9a2 remove val expansion (#539)
* remove val expansion

* types for all shapetracker functions:

* more typing

* add all the parens to the test

* more types

* fix tests

* very minor speedup
2023-02-07 15:14:05 -06:00
George Hotz
001cc96e25 Lazy refactor (#538)
* refactor lazy to return ASTs

* a lil cleaner

* oops, compare ids

* gate on GRAPH

* cleanups

* less calls to log_op

* simpler

* realize_buffers -> map_buffers

* even simpler

* think in asts

* a lil cleaner

* NOOP means contiguous
2023-02-07 11:53:21 -06:00
George Hotz
02d8cb0959 lazy cleanup 2023-02-07 07:39:53 -06:00
George Hotz
d93563f39f fix KOPT 2023-02-07 06:56:33 -06:00
Jared Z
7604b17fbf TestZeroViewShapeTracker fix test (#481)
* TestZeroViewST test

* updated to align with st naming conventions in file

* Update test_shapetracker.py
2023-02-07 06:17:55 -06:00
George Hotz
c073271f20 more symbolic correctness 2023-02-07 00:03:14 -06:00
George Hotz
e961fd3a04 more symbolic test, ModNode is wrong 2023-02-06 23:43:21 -06:00
George Hotz
8cfeb118d6 symbolic new test 2023-02-06 23:27:26 -06:00
George Hotz
7c5a5ecdac even simpler symbolic 2023-02-06 22:47:00 -06:00
George Hotz
8b05de1841 symbolic cleanups 2023-02-06 22:12:11 -06:00
George Hotz
2a924e2b77 fix sz.sh for llvm 2023-02-06 15:36:05 -06:00
James Roberts
0d405fd5bc Parallelize CI tests (#535) 2023-02-06 15:27:44 -06:00
Andrey
4977d6f225 using tuples in isinstance (#534) 2023-02-06 14:40:26 -06:00
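
Note on the commit above: `isinstance` accepts a tuple of types as its second argument, which replaces a chain of `isinstance(...) or isinstance(...)` checks. A minimal sketch follows. The function name and the choice of types are hypothetical, not taken from the commit.

```python
def is_scalar(x):
    # Equivalent to: isinstance(x, int) or isinstance(x, float)
    return isinstance(x, (int, float))


print(is_scalar(3), is_scalar(2.5), is_scalar("3"))
```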
timmermansjoy
d56c57b112 adding more robust install method (#532) 2023-02-06 13:12:05 -06:00