2553 Commits

Author SHA1 Message Date
George Hotz
0945848b5f schedule the loadops like everything else (#1964)
* schedule the loadops like everything else

* unify loadops with other things we schedule

* delete all the ops

* fix symbolic jit
2023-10-04 02:36:04 -07:00
Ahmed Harmouche
fb4d830a2a Fix cast error in render_load in wgsl (#1956)
* Fix cast error in wgsl

* Use render_cast instead of introducing new method

* Make it shorter

* Add back webgpu tests: efficientnet and dtypes
2023-10-04 02:29:14 -07:00
George Hotz
6a79d4044a unrealized consts everywhere (#1963)
* unrealized consts everywhere

* don't import device from lazy

* Device isn't in Lazy

* same issue

* disable jit random
2023-10-04 01:48:10 -07:00
nimlgen
f04c1a63ae Rand works in jit (#1960)
* rand works in jit

* better jitted rand creation

* Update realize.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-03 12:55:25 -07:00
George Hotz
f64d5b3ba8 move to realize.py (#1961)
* move to realize.py

* run_schedule moved
2023-10-03 07:25:40 -07:00
George Hotz
717451a244 Revert "optimizer: add matvec optimizations (#1753)" (#1959)
This reverts commit f520323054.
2023-10-03 00:28:42 -07:00
Francis Lam
f520323054 optimizer: add matvec optimizations (#1753)
* optimizer: add matvec optimizations

* Update optimizer.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-03 00:01:59 -07:00
nimlgen
e1f2c2cc19 fix jitted dist (#1955) 2023-10-02 11:45:13 -04:00
Roelof van Dijk
35ac60775b simplify line (#1950)
* no need to index here, zip automatically truncates

* enumerate is faster

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-02 03:19:15 -07:00
nimlgen
08e884217c metal batch executor (#1920)
* metal batch executor

* no sym_infer in backends

* calc_stat in BasicBatchExecutor

* run in batches of size 8

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-02 03:18:31 -07:00
George Hotz
d48a90859c use the opts from the default device (#1954) 2023-10-02 03:13:46 -07:00
nimlgen
c27971d51f fix llvm nan/inf const (#1951)
* allow llvm

* llvm works with inf/nan

* enable some fast math back

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-02 03:08:57 -07:00
George Hotz
6a4ec4776e fix CI (#1953)
* this work

* unauth

* update in all places
2023-10-02 02:58:58 -07:00
Daniel Riege
579cabf668 Fix examples/train_efficientnet (#1947)
* added missing colon

* bug fixes for cifar10 dataset loading
needed a reshape to work with conv layers and resolve fetched tensor to numpy since further code expects numpy array
2023-10-02 02:23:38 -07:00
David Hou
d4671cd8e3 use schedule in more places in linearizer tests (#1946)
* pass current linearizer opts to Linearizer in TestFloat4

* use schedule instead of exec_ast hook
2023-10-02 02:22:56 -07:00
Roelof van Dijk
e7a49e84c8 perf: assert behind if is not optimized (#1847)
* perf: assert behind if is not optimized

* Update helpers.py

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-09-29 11:07:24 -07:00
David Hou
8e9db88474 expand after expr_idxs in Linearizer.global_load (#1818)
* small changes

* expand in terms of substitute, directly expand g_idxs g_valid

* delete expand_ops

* don't compare using hash

* any instead of in

thanks gijskoning

Co-authored-by: Gijs Koning <gijs-koning@live.nl>

* support tc

* testing code

* no more create_rednode

* maxsize none in view/node

* oops

* undo

* typing

* oops

* oops

* lmao

* lmao

* add expand multi test

* Node.iter_idxs

* type

* type

* delete checks!

* clean up a little?

* expand_idx in symbolic

* un-golf

* play around with types >.>

* test_substitute and also remove an incorrect test?

* get rid of range

* Update symbolic.py

* split out view cache change

* split out flat components change

* reduce diff

* reduce diff

* add some float4 tests

* fix

---------

Co-authored-by: Gijs Koning <gijs-koning@live.nl>
2023-09-29 10:33:34 -07:00
nimlgen
692bec7b6f simplify CacheCollector (#1944)
* rewrite cc

* fix

* fix tests

* fix all tests

* is it better

* better with shape

* cleaner

* linter fix

* no ;

* better comment

* better comments

* no thneed changes
2023-09-29 10:13:04 -07:00
George Hotz
90326dbdc3 resnet50 hand coded optimization (#1945)
* resnet50 hand coded opt

* hand optimize one kernel

* opt in both places to fix test
2023-09-29 09:34:51 -07:00
George Hotz
a677a1e2cd winograd test prints op count 2023-09-29 05:41:29 -07:00
George Hotz
4ff35e2b97 better resnet eval (#1943) 2023-09-29 05:40:25 -07:00
George Hotz
48c8d130ae simpler GPT2 (#1941)
* don't realize in gpt2

* simpler gpt2
2023-09-29 04:41:09 -07:00
George Hotz
81cb120b0f winograd speed test (#1942) 2023-09-29 04:40:35 -07:00
George Hotz
d52df788d3 remove RawConst and add test (#1939) 2023-09-29 01:21:51 -07:00
George Hotz
22b8576887 more lazy cleanup (#1938)
* small lazy cleanups

* a few more

* cleanups

* no more realizing in the scheduler test

* a few more minor things

* that was just wrong

* fix graph. the graph test was completely useless

* make graph usable

* fix op graph
2023-09-29 00:53:29 -07:00
nimlgen
2a49f7e456 fix transfer to mapped buffers (#1923) 2023-09-29 00:50:24 -07:00
Francis Lam
f445e056ed wmma: add test and tensor core shape (#1925) 2023-09-28 18:04:28 -07:00
Yixiang Gao
094d3d71be with Tensor.train() (#1935)
* add with.train

* remove the rest TODOs

* fix pyflake

* fix pyflake error

* fix mypy
2023-09-28 18:02:31 -07:00
Yixiang Gao
10f0dc0c85 keep only one comment from git action bot (#1936) 2023-09-28 20:24:53 -04:00
wozeparrot
70671d9625 fix test_collectives (#1934)
* fix: fix test_collectives.py

* feat: reenable test_collectives
2023-09-28 11:02:22 -07:00
George Hotz
c36d0e3bd8 tvm import hook 2023-09-28 09:24:32 -07:00
George Hotz
adab724caa schedule2, keep the tests working with small changes (#1932)
* lazy cleanups

* ast functions take in LazyOps

* op instead of self.op

* _base for mops

* fix contiguous

* start schedule

* test_schedule

* fix openpilot

* more tests

* bugfix and test skip

* work

* make sure things get freed

* fix zerosized tensors

* fix failing test

* fix ceil and friends

* fix openpilot

* disable training

* disable test collectives
2023-09-28 09:14:43 -07:00
Antoine Adam
c6d5e471d0 Do not import typing_extensions at runtime (#1927)
https://github.com/tinygrad/tinygrad/pull/1852 introduced typing_extensions as a runtime requirement, but the package is only noted as a requirement for linting. So trying to use `python -c 'from tinygrad.tensor import Tensor'` after `pip install -e .` on python 3.11 will fail.

It seems that this does not happen before 3.11 only because typing_extensions was a downstream dependency of pyopencl. In any case, this commit makes it clear that typing_extensions is only needed for linting, as written in setup.py.
2023-09-28 01:57:28 -07:00
nimlgen
164f8a1923 fix hipgraph exec (#1929) 2023-09-27 12:54:21 -04:00
Sean D'Souza
9c6bb7ff13 fix: add sentencepiece to testing dependencies (#1919) 2023-09-25 11:22:01 -04:00
George Hotz
c907efbf4a reorder a few things (#1915)
* reorder a few things

* huh, that has to be there

* move apply shapetracker

* BufferOps

* only for type checking
2023-09-25 10:17:21 +08:00
chenyu
25a767cd5d Remove LtNode.__mul__ and AndNode.__mul__ (#1913) 2023-09-25 07:03:59 +08:00
chenyu
eaa8d343d8 Remove str type from map_buffers (#1912) 2023-09-25 07:03:22 +08:00
Dat D. Nguyen
ae9529e678 chore: remove redundant noise in stable diffusion example (#1910) 2023-09-24 21:33:45 +08:00
George Hotz
6d9065ed1c Minor cleanups (#1911)
* cleanups

* remove that simplify
2023-09-24 21:32:50 +08:00
George Hotz
20059dc55b Make ShapeTracker Immutable (#1909)
* ugh

* ops test pass

* fix shapetracker tests

* sym shapetracker

* shapetracker is a tuple of views now

* from_shape

* fix has variable shape

* key isn't needed

* post init assert
2023-09-24 21:09:03 +08:00
nimlgen
45f02393f0 HipGraph support (#1880)
* init hip graph

* optimize args update

* cache symbolic in jit

* remove NOSTAT

* init BasicBatchExecutor

* symbolic infer cache per jit instance

* basicbatchexec is default for compiled

* batch_exec is taken from ASTRunner

* no infer cache

* batched execution of hip graph

* add comment about hip graph batches

* readable hip graph
2023-09-24 20:14:36 +08:00
George Hotz
7ff7aacdb4 LazyOp out of Linearizer (#1908)
* loadop buffer on cpu

* works for GPU

* sort of working

* has bugs

* gpu tests pass

* fix some tests

* fix tensor cores

* fix test linearizer

* fix symbolic

* fix has_variable_shape

* non symbolic size

* disable weird test

* simple cache fix

* fix custom function

* fix kopt

* cleanups

* a bit broken on the assign

* contig check

* only buffer

* need that order

* idx

* dedup buffers

* hmm, bugfix

* fix tensor cores

* opts device
2023-09-24 14:30:53 +08:00
qazal
2201b46bce Refactor Conv2d/ConvTranspose2d into a single parent class (#1906)
* refactor Conv2d/ConvTranspose2d

* raise in __call__ for the parent class

* use ABC

* drop ABC it's just syntactic sugar

* use conv2d as base for the transposed version
2023-09-24 14:23:41 +08:00
George Hotz
97dc813329 Revert "All LazyOps in the Linearizer (#1905)" (#1907)
This reverts commit a5820390db.
2023-09-24 11:51:22 +08:00
George Hotz
a5820390db All LazyOps in the Linearizer (#1905)
* loadop buffer on cpu

* works for GPU

* sort of working

* has bugs

* gpu tests pass

* fix some tests

* fix tensor cores

* fix test linearizer

* fix symbolic

* fix has_variable_shape

* non symbolic size

* disable weird test

* simple cache fix

* fix custom function

* fix kopt

* cleanups

* a bit broken on the assign

* contig check

* only buffer

* need that order

* idx
2023-09-24 11:50:00 +08:00
George Hotz
0f373b8b47 cache more uops (#1904)
* cache more uops

* fix cacheable
2023-09-23 16:50:13 +08:00
George Hotz
1e15fdaee7 disable flaky triton test 2023-09-23 14:59:36 +08:00
George Hotz
0571dd7627 move all int (#1903) 2023-09-23 14:43:45 +08:00
nimlgen
41aea3ad36 require C-contiguous array for hip._copyin (#1902) 2023-09-23 14:36:59 +08:00