Commit Graph

2496 Commits

Author SHA1 Message Date
chenyu
b89ee1ac83 lazy type annotation and cleanups (#1897) 2023-09-22 14:20:23 +08:00
George Hotz
78576915de Add needed contiguous to DiskBuffer. SHM support on OSX (#1891)
* add some contiguous

* remove second contig

* Revert "remove second contig"

This reverts commit fc164f7dca1ad75b1e466e4e45a05eca58b7e0e0.

* shm on osx

* can repro bug

* don't contig zeros and ones
2023-09-22 09:16:42 +08:00
qazal
d0e752003d fixes (#1893) 2023-09-22 07:20:27 +08:00
wozeparrot
009a99a0b1 feat: way cleaner hip wrapper (#1895) 2023-09-22 07:20:03 +08:00
Yixiang Gao
cb5d6576cb cifar step time 65ms while stay above 94% (#1888)
* change reduceop heuristics

* add model ema and jit hack

* add ema eval

* have to create a duplicate eval function for jit

* remove manual seed

* 94% achievable with normal eval

* ema is outputting the same results as normal

* fix ema bug

* ema achieves 94% with fix seed

* multigpu tested

* constant fold decay, fix jit, adjust message for multigpu

* pull SpeedyResNet out of train_cifar()
2023-09-21 11:19:32 +08:00
kormann
864746d6aa polish print_tree (#1868)
* fix

* isinstance
2023-09-21 11:13:10 +08:00
chenyu
a5090f0ee9 remove NumNode.int() (#1876) 2023-09-21 10:29:16 +08:00
Gijs Koning
9eb6310686 Fix gpt optimization (#1885)
* fix for gpt

* the actual fix

* Remove change in symbolic

* small comment
2023-09-21 10:28:18 +08:00
Szymon Ożóg
bd3444797b make ssa assign r[u] (#1887) 2023-09-21 10:20:20 +08:00
nimlgen
9450e41f70 no import when Python is shutting down (#1875) 2023-09-20 12:47:02 -04:00
Yixiang Gao
84ab47a90a add branch up-to-date check (#1879) 2023-09-20 12:41:51 -04:00
nimlgen
504bb6d0ea support symbolic jit in HIP (#1877) 2023-09-20 01:44:26 -04:00
chenyu
cd66c9e249 no numnode in shape (#1871) 2023-09-17 07:49:45 +08:00
Yixiang Gao
18ec5a9e09 add comment bot to CI (#1873) 2023-09-16 12:22:06 -04:00
Yixiang Gao
a27f6c7d62 add diff mode to sz.py (#1872) 2023-09-16 00:43:47 -04:00
nimlgen
4c31dfafb3 add seed to gpt-2 (#1869) 2023-09-15 17:34:14 -04:00
wozeparrot
c870764940 Revert "add line changes diff bot to CI (#1863)" (#1870) 2023-09-15 16:56:42 -04:00
Yixiang Gao
789c84a7a3 add line changes diff bot to CI (#1863) 2023-09-15 16:29:58 -04:00
chenyu
29ac8293d7 run gpt2 in CI (#1866) 2023-09-15 04:37:02 +08:00
chenyu
1b46de1a3e fix type of helpers.prod, add test cases (#1859) 2023-09-14 05:16:55 +08:00
chenyu
e67306ba04 symbolic shape type with TypeGuard (#1852) 2023-09-13 05:27:22 +08:00
Roelof van Dijk
c91b44f7bf refactor: move size to view (#1848)
* refactor: move size to view

* fix: pylint

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-11 07:16:04 -07:00
chenyu
9e9ea20784 Fix view, CI cpu test with python 3.8 (#1845) 2023-09-10 22:37:58 -04:00
chenyu
3ec301c2d7 apply view.py patch (#1844) 2023-09-10 17:32:15 -07:00
Yixiang Gao
a32951a001 add test_tensor_copy (#1840)
* add test_tensor_copy

* fix whitespace

* add value check
2023-09-10 16:01:58 -07:00
Roelof van Dijk
1bc52c60df fix: minor tweaks to view (#1842)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-10 15:55:57 -07:00
George Hotz
47e602f717 view: do not trade complexity for speed (#1839)
* view: do not trade complexity for speed

* staticmethods

* view create
2023-09-10 11:29:53 -07:00
chenyu
c0bc4cfbaf DivNode.b is int (#1833) 2023-09-10 09:04:29 -07:00
nimlgen
13790b1e20 cast types in render_load (#1837) 2023-09-10 07:58:13 -07:00
David Hou
e74a6ca7e4 expand in terms of substitute (#1827) 2023-09-09 14:43:00 -07:00
George Hotz
0e3e2bac13 amd wino: upload results 2023-09-09 13:57:14 -07:00
George Hotz
6f95c5f284 winograd speed test for AMD (#1826) 2023-09-09 13:56:33 -07:00
George Hotz
0f2bd10d00 add winograd CIFAR to mac tests (#1825)
* add winograd CIFAR to mac tests

* symlink already done
2023-09-09 13:45:24 -07:00
nimlgen
31fca43706 kopt works with local+grouped reduce and tests (#1824) 2023-09-09 13:22:09 -07:00
chenyu
9da40c8448 move Node.__lt__ SumNode special case to SumNode (#1823) 2023-09-09 13:20:38 -07:00
Francis Lam
651205fa5c linearizer: support local and group_for_reduce dimensions together (#1821)
also minor changes to test_speed_v_torch.py and size of UOps.SPECIAL
2023-09-08 12:39:27 -07:00
segf00lt
9e8c1dbf34 patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
chenyu
ebcda8a714 Move var_vals from ShapeTracker to LazyBuffer (#1819) 2023-09-08 09:25:10 -07:00
kormann
7ac65a93b4 utils.printtree (#1816)
* utils.printtree

* linter compliance

* rename to print_tree
2023-09-07 23:08:57 -07:00
George Hotz
4613c9e77c add tvm example, formatting (#1813)
* add tvm example

* no realize
2023-09-07 11:50:41 -07:00
nimlgen
5b15a972b5 no functions with same names in test/ (#1811) 2023-09-07 11:27:31 -07:00
George Hotz
722823dee1 stable diffusion: force fp16 free 2023-09-06 15:11:05 -07:00
chenyu
928cb1a64a AndNode.substitute short circuit (#1800)
* AndNode substitute short circuit

* Node.__bool__ is faster than Node.__eq__
2023-09-06 14:58:49 -07:00
nimlgen
a78a1fa499 fix jit buffer reuse when freed (#1802)
* fix jit buffer reuse when freed

* Forbid output_buffer reusage
2023-09-06 14:41:57 -07:00
Yixiang Gao
22cf15e9d0 convert function into tinygrad (#1803) 2023-09-06 14:41:26 -07:00
Pavol Rusnak
52a92bf95d use class Foo: instead of class Foo(): (#1797)
* use class Foo: instead of class Foo():

* add ruff linter, copy settings from .flake8 to ruff.toml
2023-09-06 12:20:25 -07:00
badcc
fd25792c8b Ensure freqs as type float32 in freqs_cis (#1798) 2023-09-06 10:24:15 -07:00
chenyu
35072877ef sym_infer is noop for int input (#1795) 2023-09-06 09:17:20 -07:00
George Hotz
f67638b27a delete broken DDPG example 2023-09-06 08:01:12 -07:00
George Hotz
78a43ad2c7 add uop fixup (#1793) 2023-09-06 07:55:22 -07:00