2952 Commits

Author SHA1 Message Date
George Hotz
6fb8b3bb60 move symbolic functions to shapetracker (#1901) 2023-09-23 11:45:08 +08:00
George Hotz
9cf13bd055 rename reduce_op (#1900)
* rename reduce_op

* more design v2
2023-09-23 11:27:36 +08:00
George Hotz
73a6ed7862 Apply ShapeTracker in interpreted backends (#1846)
* applying st

* tests pass

* minor cleanups

* torch too

* hack

* contiguous

* move mops

* contig in BN

* tests should pass

* make torch fast

* make zeros and ones contig by default

* no contig there

* fix padding with expanding

* might fix tests

* still doesn't fix bug, but should be there

* Revert "still doesn't fix bug, but should be there"

This reverts commit 8ea92f3e07.

* minor cleanups
2023-09-23 10:05:13 +08:00
Umut Zengin
3987280daf Fix VALIDHACKS for Images and make it default (#1832)
* valid hacks

* valid hacks

* valid hacks

* new method

* new method

* handtune

* is gate load breaking?

* lint

ruff

less junk

new approach?

maybe this?

* Make it more clear

* Make it more clear

* Will deal with the linter later

* hack for linter

* subs the idx but dont touch the valid

* Updated the mod rules

* lint hack

* I believe bug fix lets see

* Mod Node left

* revert

* Maybe this wont break?

* revert

* implemented "handtuned garbage"

* revert and use VALIDHACKS

* Lets see the CI

* still broken?

* currently its jungle

* maybe this jungle ?

* This works for everything somehow

* Added test for symbolic

* lint

* final touch

* This still works

* lint

* midway clean

* less garbage

* lint

* final form

* Slow but working way

* lint and other stuff

* lint

* mypy

* Make sure CI test Openpilot valid checks

* test if CI break

* Convert back

* refactor

* refactor

* Managed to reduce openpilot time from 30 secs to 5 secs

* Refactor

* Substitute a node with variable

* flake8

* Comment and refactor

* More comprehensive mod

* refactor

* bug fix

* More shave off

* remove not sure part
2023-09-23 07:34:43 +08:00
Gijs Koning
767bb35903 Enable symbolic ops tests for LLVM (#1898)
* Enable symbolic tests for HIP and LLVM

* Only llvm
2023-09-23 07:30:26 +08:00
Gijs Koning
b8ff20ffe4 Gpt2 (#1896)
* small helps

* got something working

* faster?

* faster yes

* cleanup

* cleanup

* cleanup

* Fix non jit

* Fix fp16 and some cleanup

* Fix fp16 and some cleanup

* cleanup

* similar to master

* cleanup
2023-09-22 20:14:47 +08:00
chenyu
b89ee1ac83 lazy type annotation and cleanups (#1897) 2023-09-22 14:20:23 +08:00
George Hotz
78576915de Add needed contiguous to DiskBuffer. SHM support on OSX (#1891)
* add some contiguous

* remove second contig

* Revert "remove second contig"

This reverts commit fc164f7dca1ad75b1e466e4e45a05eca58b7e0e0.

* shm on osx

* can repro bug

* don't contig zeros and ones
2023-09-22 09:16:42 +08:00
qazal
d0e752003d fixes (#1893) 2023-09-22 07:20:27 +08:00
wozeparrot
009a99a0b1 feat: way cleaner hip wrapper (#1895) 2023-09-22 07:20:03 +08:00
Yixiang Gao
cb5d6576cb cifar step time 65ms while stay above 94% (#1888)
* change reduceop heuristics

* add model ema and jit hack

* add ema eval

* have to create a duplicate eval function for jit

* remove manual seed

* 94% achievable with normal eval

* ema is outputting the same results as normal

* fix ema bug

* ema achieves 94% with fix seed

* multigpu tested

* constant fold decay, fix jit, adjust message for multigpu

* pull SpeedyResNet out of train_cifar()
2023-09-21 11:19:32 +08:00
kormann
864746d6aa polish print_tree (#1868)
* fix

* isinstance
2023-09-21 11:13:10 +08:00
chenyu
a5090f0ee9 remove NumNode.int() (#1876) 2023-09-21 10:29:16 +08:00
Gijs Koning
9eb6310686 Fix gpt optimization (#1885)
* fix for gpt

* the actual fix

* Remove change in symbolic

* small comment
2023-09-21 10:28:18 +08:00
Szymon Ożóg
bd3444797b make ssa assign r[u] (#1887) 2023-09-21 10:20:20 +08:00
nimlgen
9450e41f70 no import when Python is shutting down (#1875) 2023-09-20 12:47:02 -04:00
Yixiang Gao
84ab47a90a add branch up-to-date check (#1879) 2023-09-20 12:41:51 -04:00
nimlgen
504bb6d0ea support symbolic jit in HIP (#1877) 2023-09-20 01:44:26 -04:00
chenyu
cd66c9e249 no numnode in shape (#1871) 2023-09-17 07:49:45 +08:00
Yixiang Gao
18ec5a9e09 add comment bot to CI (#1873) 2023-09-16 12:22:06 -04:00
Yixiang Gao
a27f6c7d62 add diff mode to sz.py (#1872) 2023-09-16 00:43:47 -04:00
nimlgen
4c31dfafb3 add seed to gpt-2 (#1869) 2023-09-15 17:34:14 -04:00
wozeparrot
c870764940 Revert "add line changes diff bot to CI (#1863)" (#1870) 2023-09-15 16:56:42 -04:00
Yixiang Gao
789c84a7a3 add line changes diff bot to CI (#1863) 2023-09-15 16:29:58 -04:00
chenyu
29ac8293d7 run gpt2 in CI (#1866) 2023-09-15 04:37:02 +08:00
chenyu
1b46de1a3e fix type of helpers.prod, add test cases (#1859) 2023-09-14 05:16:55 +08:00
chenyu
e67306ba04 symbolic shape type with TypeGuard (#1852) 2023-09-13 05:27:22 +08:00
Roelof van Dijk
c91b44f7bf refactor: move size to view (#1848)
* refactor: move size to view

* fix: pylint

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-11 07:16:04 -07:00
chenyu
9e9ea20784 Fix view, CI cpu test with python 3.8 (#1845) 2023-09-10 22:37:58 -04:00
chenyu
3ec301c2d7 apply view.py patch (#1844) 2023-09-10 17:32:15 -07:00
Yixiang Gao
a32951a001 add test_tensor_copy (#1840)
* add test_tensor_copy

* fix whitespace

* add value check
2023-09-10 16:01:58 -07:00
Roelof van Dijk
1bc52c60df fix: minor tweaks to view (#1842)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-10 15:55:57 -07:00
George Hotz
47e602f717 view: do not trade complexity for speed (#1839)
* view: do not trade complexity for speed

* staticmethods

* view create
2023-09-10 11:29:53 -07:00
chenyu
c0bc4cfbaf DivNode.b is int (#1833) 2023-09-10 09:04:29 -07:00
nimlgen
13790b1e20 cast types in render_load (#1837) 2023-09-10 07:58:13 -07:00
David Hou
e74a6ca7e4 expand in terms of substitute (#1827) 2023-09-09 14:43:00 -07:00
George Hotz
0e3e2bac13 amd wino: upload results 2023-09-09 13:57:14 -07:00
George Hotz
6f95c5f284 winograd speed test for AMD (#1826) 2023-09-09 13:56:33 -07:00
George Hotz
0f2bd10d00 add winograd CIFAR to mac tests (#1825)
* add winograd CIFAR to mac tests

* symlink already done
2023-09-09 13:45:24 -07:00
nimlgen
31fca43706 kopt works with local+grouped reduce and tests (#1824) 2023-09-09 13:22:09 -07:00
chenyu
9da40c8448 move Node.__lt__ SumNode special case to SumNode (#1823) 2023-09-09 13:20:38 -07:00
Francis Lam
651205fa5c linearizer: support local and group_for_reduce dimensions together (#1821)
also minor changes to test_speed_v_torch.py and size of UOps.SPECIAL
2023-09-08 12:39:27 -07:00
segf00lt
9e8c1dbf34 patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
chenyu
ebcda8a714 Move var_vals from ShapeTracker to LazyBuffer (#1819) 2023-09-08 09:25:10 -07:00
kormann
7ac65a93b4 utils.printtree (#1816)
* utils.printtree

* linter compliance

* rename to print_tree
2023-09-07 23:08:57 -07:00
George Hotz
4613c9e77c add tvm example, formatting (#1813)
* add tvm example

* no realize
2023-09-07 11:50:41 -07:00
nimlgen
5b15a972b5 no functions with same names in test/ (#1811) 2023-09-07 11:27:31 -07:00
George Hotz
722823dee1 stable diffusion: force fp16 free 2023-09-06 15:11:05 -07:00
chenyu
928cb1a64a AndNode.substitute short circuit (#1800)
* AndNode substitute short circuit

* Node.__bool__ is faster than Node.__eq__
2023-09-06 14:58:49 -07:00
nimlgen
a78a1fa499 fix jit buffer reuse when freed (#1802)
* fix jit buffer reuse when freed

* Forbid output_buffer reuse
2023-09-06 14:41:57 -07:00