Commit Graph

194 Commits

Author SHA1 Message Date
chenyu
d0eb1540d5 helpers.diskcache_clear (#4436)
drop all tables in diskcache. added a unit test but disabled it by default because it will drop all cache...
2024-05-05 14:19:01 -04:00
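A minimal sketch of what clearing a SQLite-backed disk cache can look like (the db path and function shape here are illustrative assumptions, not tinygrad's exact internals):

```python
import sqlite3

def diskcache_clear(db_path: str = "cache.db") -> None:
  conn = sqlite3.connect(db_path)
  cur = conn.cursor()
  # enumerate every user table in the database, then drop each one
  tables = cur.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
  for (name,) in tables:
    cur.execute(f'DROP TABLE IF EXISTS "{name}"')  # quote: names may contain ':' or '-'
  conn.commit()
  conn.close()
```

This is also why the unit test is disabled by default: running it wipes every cached kernel, not just test entries.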
George Hotz
cb7289f9c9 remove clang program header (#4422)
* remove clang program header

* proper max

* bools are numbers

* fix compile enet
2024-05-04 08:38:01 -07:00
George Hotz
9fc4465557 subbuffer support (#4397)
* subbuffer support

* diskbuffer offset

* cuda subbuffer works

* use subbuffer

* more subbuffer tests

* consecutive

* cast

* consec

* offset

* view is a better name

* offset is in nbytes

* fix view + memory planner

* delete unused DiskRunner

* reverse order

* no subbuffers on unrealized consts

* only enabled for disk

* don't reverse memory

* view supported devices

* pickle buffer view

* ring jit

* support extra view inputs in jit

* fix JIT=2 issue

* test copy jit

* p2p isn't an option anymore

* fix dep tracking issue

* fix mypy

* fix pickle

* from_nv is contents now
2024-05-03 18:05:57 -07:00
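The idea behind subbuffers: a view shares its base buffer's storage at a byte offset instead of copying. A toy illustration with memoryview (the names here are hypothetical; tinygrad's Buffer API differs):

```python
class Buffer:
  def __init__(self, size: int):
    self.data = bytearray(size)
  def view(self, offset: int, size: int) -> memoryview:
    # "offset is in nbytes": the view aliases the base storage, no copy
    assert offset + size <= len(self.data)
    return memoryview(self.data)[offset:offset + size]

base = Buffer(16)
sub = base.view(offset=4, size=8)
sub[0] = 0xFF
assert base.data[4] == 0xFF  # writes through the view land in the base buffer
```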
George Hotz
2786dff26d new disk tensor tests (#4393) 2024-05-02 08:54:44 -07:00
George Hotz
bd49d2854a hotfix: skip fetch tests always 2024-05-01 08:43:26 -07:00
George Hotz
27ee49bf30 tensor variable (#4362)
* tensor variable support

* consttype without variable?

* __setitem__

* symbolic mean works

* arange test

* more tests

* a few more tests
2024-04-30 14:08:57 -07:00
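Roughly how symbolic shapes are exercised, as a sketch assuming the Variable API of this era (import paths and exact methods differ across tinygrad versions):

```python
from tinygrad import Tensor
from tinygrad.shape.symbolic import Variable

i = Variable("i", 1, 10).bind(3)     # a dim known only as 1 <= i <= 10, bound to 3
t = Tensor.rand(3, 3).reshape(i, 3)  # the compiler sees shape (i, 3)
print(t.mean().numpy())              # one compiled kernel serves every bound value of i
```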
George Hotz
d325be2540 update docs (#4356)
* update docs

* nn.md

* mnist cleanups

* rhip test is very slow
2024-04-30 16:51:42 +09:00
Obada Khalili
e4befa41d7 Fix in _reshape_mask (#4332)
* handle reshape with remainder in _reshape_mask

* remove trailing whitespace

* use helper_test_op to generate tensors from shapes

* test in shapetracker too

* remove whitespace

* revert property name in other class tests
2024-04-28 11:57:39 -04:00
George Hotz
b6e7243bfa hotfix: skip slow pre-commit test 2024-04-16 11:48:43 +04:00
chenyu
f6c8032e5d assert if expr_idxs return might be outside of int32 (#4157) 2024-04-12 14:18:35 -04:00
uuuvn
8a40d7d423 Shape changing bitcast and assert bitcast in disk (#3973)
* Shape changing bitcast

* only support it on disk

* basic test

* more tests

* RuntimeError instead of assert

* create unique temp files

* move tests that use disk to test_disk_tensor

* linter

* remove assert on error messages

* that's RuntimeError now

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-28 21:49:10 -07:00
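A shape-changing bitcast reinterprets the same bytes under a new dtype, scaling the last axis by the ratio of element sizes. NumPy's view has the same semantics, as a reference point:

```python
import numpy as np

a = np.arange(4, dtype=np.float32)  # shape (4,), 16 bytes total
b = a.view(np.uint16)               # 4-byte floats to 2-byte ints: last axis doubles
assert b.shape == (8,)
c = a.view(np.int64)                # 4-byte floats to 8-byte ints: last axis halves
assert c.shape == (2,)
```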
chenyu
519336cfea factor out partial in SumNode div int (#3841)
* factor out partial in SumNode div int

* div not rem

* space
2024-03-20 16:34:33 -04:00
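The underlying arithmetic: in floor division, any part of a sum that is exactly divisible by the divisor factors out, so only the remainder terms need dividing. A quick check of the identity:

```python
# (c*x + r) // c == x + r // c for any integers x, r and c > 0 (floor division)
c = 4
for x in range(-5, 6):
  for r in range(-5, 6):
    assert (c * x + r) // c == x + r // c
```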
chenyu
455f7bea9b test example from half resnet that idx has number outside of int32 (#3838)
* test example from half resnet that idx has number outside of int32

* ruff
2024-03-20 13:44:20 -04:00
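The failure mode: a flattened index expression whose maximum exceeds int32 wraps around on backends that compute indices in 32 bits. A sketch of the kind of guard the assert provides (the names here are illustrative):

```python
INT32_MAX = 2**31 - 1

def check_idx_fits_int32(max_idx: int) -> None:
  if max_idx > INT32_MAX:
    raise RuntimeError(f"index expression max {max_idx} exceeds int32")

# a large half-precision activation can overflow a 32-bit flat index:
check_idx_fits_int32(2**20 - 1)                # fine
try:
  check_idx_fits_int32(256 * 1024 * 1024 * 9)  # ~2.4e9 elements
except RuntimeError as e:
  print(e)
```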
Patrick Tsai
b436c9792f Fix factoring bug (O(n) arange related) (#3817)
* Factoring bug

* Another one in case

* It works now so change tests back

* large arange cumsum optimization

* More cleanup

* symbolic no factor div test

* name change

* Rename test

---------

Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
2024-03-19 11:49:42 -04:00
wozeparrot
a0ab755317 threefry again (#3785)
* feat: initial xor

* feat: initial threefry

* feat: remove custom random

* fix: really need to install precommit

* feat: lmao forgot that this is rotate not a shift

* clean: put that there

* feat: numpy xor

* feat: quick test for xor

* feat: llvm xor

* feat: slightly working xor in torch

* feat: rand works in jit

* clean: save a line

* feat: match jax

* feat: maybe test against jax

* feat: requires_grad

* fix: fix test_symbolic_ops

* feat: lower alpha

* feat: just pad

* fix: maybe fix training tests?

* fix: fix some llvm stuff

* feat: cursed realize on the way out

* feat: testing jax

* fix: why is the jax install process not simple

* fix: maybe passing test

* fix: symbolic workarounds

* clean: still need that precommit

* fix: aaaa

* fix: more test fixes

* fix: quick fix for wgsl

* feat: need to set requires_grad on the final tensor

* feat: one more tensor

* feat: don't take forever

* feat: seeing why ci is broken

* feat: can't allocate 64GiB lmao

* fix: fix this

* feat: hope this doesn't break smth before i go to bed

* feat: don't destroy ram

* feat: int

* feat: remove jax

* feat: properish workaround?

* feat: skip slow webgpu tests

* feat: no longer fails

* feat: use dtypes

* feat: real number

* fix: torch

* fix: don't test against reference for torch

* feat: to device

* feat: fix advanced indexing

* feat: correct casting

* feat: even rng_counter

* feat: match master

* feat: this was actually bad

* fix: maybe?

* feat: store

* feat: remove realizes

* feat: somehow this is important

* feat: somehow this is also important

* feat: save a line

* fix: don't need that anymore

* feat: restore this

* fix: linter

* feat: remove realizes

* fix: realized is in base now

* fix: add back cast

* fix: bump deadline

* fix: bump deadline

* fix: bump deadline

* fix: bump deadline

* fix: bump deadline

* fix: :(

* fix: :(

* fix: not being dumb

* feat: try changing less tests

* feat: shouldn't have to change that

* feat: contiguous bumps it by one

* fix: hmm

* fix: numpy memory moment

* fix: cl_khr_fp16

* fix: torch has different tensor count

* fix: missing contiguous

* hmm: hmm

* fix: some fixes

* fix: typing

* feat: dont do that

* feat: typing fixes

* feat: why is this realize required?

* feat: ngl kinda odd typing

* feat: oh

* feat: remove realizes

* feat: why is this realize required?

* fix: hacky patch for cudacpu

* fix: without this realize pytest crashes?????

* fix: shorter line

* fix: cudacpu fixes

* fix: cudacpu fixes

* feat: real buffer

* feat: don't search when searching lmao

* fix: can't use contiguous things

* fix: no more 100GB arrays

* fix: revert

* fix: skip 7 and 10

* feat: working ish beam

* feat: minimize changes

* feat: seed 0 stable diffusion example changed

* fix: different on ci

* fix: no beam

* feat: make threefry optional

* fix: check value

* fix: unused import

* feat: threefry default

* fix: 5d

* feat: allow non upcast div

* fix: 5d better

* fix: 5d better

* fix: save all dtype

* feat: proper error

* feat: lazyop key

* fix: check float

* feat: try removing this realize now

* feat: disable threefry for uops hip tensor cores

* feat: don't need that

* feat: only check upcast

* fix: disable threefry for some metal tests

* feat: disable for metal tensor uops as well

* feat: disable for most uops

* fix: disable threefry for new uops tests

* feat: multitensor

* fix: typing

* feat: threefry default off

* feat: skip threefry half rand

* feat: restore old

* fix: bad git

* clean: ruff

* feat: bfloat16 fix

* fix: :|

* feat: restore old

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-03-18 16:47:07 -04:00
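Threefry is a counter-based PRNG: the random bits are a pure function of (key, counter), so rand becomes reproducible, parallel, and JIT-friendly with no hidden state. Below is a self-contained reference implementation of Threefry-2x32 with the standard 20 rounds (following Salmon et al., "Parallel Random Numbers: As Easy as 1, 2, 3"; this is not tinygrad's tensorized version):

```python
MASK = 0xFFFFFFFF
ROTATIONS = (13, 15, 26, 6, 17, 29, 16, 24)

def rotl32(x: int, r: int) -> int:
  return ((x << r) | (x >> (32 - r))) & MASK

def threefry2x32(key: tuple, ctr: tuple, rounds: int = 20) -> tuple:
  # key schedule: two key words plus the Threefish parity constant
  ks = (key[0], key[1], key[0] ^ key[1] ^ 0x1BD11BDA)
  x0, x1 = (ctr[0] + ks[0]) & MASK, (ctr[1] + ks[1]) & MASK
  for r in range(rounds):
    x0 = (x0 + x1) & MASK                   # mix
    x1 = rotl32(x1, ROTATIONS[r % 8]) ^ x0  # rotate and xor
    if r % 4 == 3:                          # inject the key every 4 rounds
      j = r // 4 + 1
      x0 = (x0 + ks[j % 3]) & MASK
      x1 = (x1 + ks[(j + 1) % 3] + j) & MASK
  return x0, x1

# known-answer test vector from the Random123 reference suite
assert threefry2x32((0, 0), (0, 0)) == (0x6B200159, 0x99BA4EFE)
```

Judging by the bullets above, much of the long tail of fixes is about threading the rng_counter tensor through the JIT, multitensor, and the individual backends.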
chenyu
639bd5dbfc move bf16 cast hack to Tensor.llvm_bf16_cast (#3788) 2024-03-17 18:51:22 -04:00
George Hotz
311cf2b7d3 Revert "threefry_2x32 (#2601)" (#3784)
This reverts commit db3de54bc4.
2024-03-17 10:27:20 -07:00
wozeparrot
db3de54bc4 threefry_2x32 (#2601)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-17 10:19:33 -07:00
George Hotz
53adcb34f5 remove hip backend (#3783)
* remove hip backend

* remove unused

* rhip

* more RHIP
2024-03-17 10:12:16 -07:00
chenyu
a2d3cf64a5 move is_dtype_supported to test.helpers (#3762)
* move is_dtype_supported to test.helpers

updated all places that check if float16 is supported

* fix tests
2024-03-15 14:33:26 -04:00
George Hotz
69ca7f7bf9 changes for teenygrad (#3665)
* changes for teenygrad

* upd

* simpler test
2024-03-09 15:30:34 -08:00
chenyu
8f10bfa2ff ban __bool__ on Tensor (#3632)
* ban __bool__ on Tensor

avoid misuse

* test case

* fix tests

* fix more tests
2024-03-06 17:12:35 -05:00
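The ban is the standard pattern of raising in `__bool__`, so that `if t:` on a Tensor fails loudly instead of silently evaluating to True like any other object. A minimal sketch (not tinygrad's exact message):

```python
class Tensor:
  def __bool__(self):
    raise TypeError("Tensor can't be cast to bool; realize it and compare values instead")

t = Tensor()
try:
  if t:               # previously always truthy, a common silent bug
    pass
except TypeError as e:
  print(e)
```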
chenyu
968d109453 apply more create_lt_node (#3597)
updated one in linearizer if condition, and various symbolic tests
2024-03-03 16:12:39 -05:00
Patrick Tsai
0082300a59 Fix symbolic negative floordiv (#3594)
Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
2024-03-03 11:40:52 -05:00
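The classic pitfall here: Python's `//` floors toward negative infinity while C-style division truncates toward zero, and a symbolic floordiv must commit to one convention:

```python
q_floor = -7 // 2      # -4: Python floors toward negative infinity
q_trunc = int(-7 / 2)  # -3: C-style truncation toward zero
assert (q_floor, q_trunc) == (-4, -3)
# floor division keeps the Euclidean identity with a nonnegative remainder:
assert -7 == (-7 // 2) * 2 + (-7 % 2) and (-7 % 2) == 1
```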
chenyu
e09619ab6c explicitly create_lt_node when used in shapetracker _expr_view (#3561)
* explicitly create_lt_node when used in shapetracker

leave regular __lt__ and cmps for symbolic shape cmp

* hmm it fixed that?

* LtNode.substitute uses create_lt_node
2024-03-03 10:08:21 -05:00
Marcin Słowik
f90caa4b92 Escape table name in diskcache queries. (#3543)
Some devices create cache table names with non-alphanumeric characters, e.g. "compile_hip_gfx1010:xnack-_12".
This commit escapes the table name in single quotes so that sqlite works (see https://github.com/tinygrad/tinygrad/issues/3538).
2024-02-29 13:04:21 -08:00
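A runnable illustration of the failure and the fix. The commit wraps the name in single quotes; double quotes, SQLite's standard identifier quoting, handle the same characters:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
table = "compile_hip_gfx1010:xnack-_12"  # ':' and '-' break an unquoted identifier
conn.execute(f'CREATE TABLE "{table}" (key TEXT PRIMARY KEY, val BLOB)')
conn.execute(f'INSERT INTO "{table}" VALUES (?, ?)', ("k", b"v"))
assert conn.execute(f'SELECT val FROM "{table}" WHERE key = ?', ("k",)).fetchone() == (b"v",)
```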
George Hotz
48918fa75a fix disktensor offset issue (#3532) 2024-02-28 17:22:17 -08:00
chenyu
0c6846f9fc failed test case for disk tensor assign into dtype int64 (#3527)
failing case for #3510, marked as expectedFailure for now
2024-02-28 17:52:21 -05:00
chenyu
88939c3347 fix Node.max can be symbolic (#3514)
Also made sure taking max twice can yield an int.
2024-02-27 17:21:31 -05:00
chenyu
61605ccc69 Remove special case of SumNode div SumNode (#3502) 2024-02-26 09:42:06 -05:00
George Hotz
871ba73e65 _reduce_op is axis based now (#3462)
* _reduce_op is axis based now

* axis_

* update lin failures

* disable that

* fix shape
2024-02-21 16:36:31 +01:00
chenyu
0d326a48b8 fix LtNode simplification when lhs and rhs contain same variables (#3451)
* fix LtNode simplification when lhs and rhs contain same variables

`(Variable("a", 1, 5) < Variable("a", 1, 5))` should eval to `NumNode(0)`

* fix with less perf impact
2024-02-20 09:06:55 -05:00
chenyu
2da734920e use __getnewargs__ to fix unpickling Variable (#3441)
It's recommended to use __getnewargs__ to supply the arguments for classes that use __new__ when unpickling.
It's preferred because it does not change the __new__ behavior.
2024-02-18 10:28:37 -05:00
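A minimal example of the pattern: a class that interns instances in `__new__` needs `__getnewargs__` so pickle can replay the constructor arguments on load instead of calling `__new__` with none:

```python
import pickle

class Variable:
  _cache: dict = {}
  def __new__(cls, name, vmin, vmax):
    key = (name, vmin, vmax)
    if key not in cls._cache:             # intern: one instance per unique args
      self = super().__new__(cls)
      self.name, self.vmin, self.vmax = name, vmin, vmax
      cls._cache[key] = self
    return cls._cache[key]
  def __getnewargs__(self):               # replayed into __new__ on unpickle
    return (self.name, self.vmin, self.vmax)

v = Variable("i", 1, 10)
assert pickle.loads(pickle.dumps(v)) is v  # interning survives the round trip
```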
xarkes
28a8b72024 Remove Interpreted device & remaining CPU/TORCH ref (#3423)
* Remove Interpreted device & remaining CPU/TORCH ref

* Oops

* supports_device was useful

* Fix doc wording

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-02-16 00:30:21 -05:00
George Hotz
93eceef727 remove cpu prereqs (#3410) 2024-02-15 13:45:06 +01:00
Jyotirmaya Mahanta
b6a2600c86 fix merging condition in merge_dims (#3363)
* fix merging condition in merge_dims

* add tests

* set contiguous after mask is canonicalized

* minor fix
2024-02-12 11:50:26 +01:00
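merge_dims collapses adjacent dimensions when the pair is contiguous in memory, i.e. when stride[i] == shape[i+1] * stride[i+1]. A simplified sketch of that condition (tinygrad's real version also tracks masks, which the fix above canonicalizes before setting contiguous):

```python
def merge_dims(shape, strides):
  # greedily fold each dim into the previous one when the pair is contiguous
  out = [(shape[0], strides[0])]
  for s, st in zip(shape[1:], strides[1:]):
    ps, pst = out[-1]
    if pst == s * st:        # contiguous pair: merge into one bigger dim
      out[-1] = (ps * s, st)
    elif s == 1:             # size-1 dims never block merging
      continue
    else:
      out.append((s, st))
  return out

assert merge_dims((2, 3, 4), (12, 4, 1)) == [(24, 1)]
assert merge_dims((2, 3, 4), (12, 4, 2)) == [(6, 4), (4, 2)]  # last dim not contiguous
```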
chenyu
97275101e9 fix safetensor load uint32 and uint64 (#3315)
the correct keys are U32 and U64.
2024-02-04 10:46:27 -05:00
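For reference, the safetensors header encodes dtypes as short string keys; a subset of the mapping per the safetensors format, showing why U32 and U64 are the right keys:

```python
# safetensors header dtype keys -> dtype names (subset)
SAFETENSORS_DTYPES = {
  "BOOL": "bool",   "U8": "uint8",        "I8": "int8",
  "I32": "int32",   "U32": "uint32",      # "U32", not "UINT32"
  "I64": "int64",   "U64": "uint64",      # "U64", not "UINT64"
  "F16": "float16", "BF16": "bfloat16",
  "F32": "float32", "F64": "float64",
}
```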
Yoshinori Sano
edb74897b2 support safe load bf16 (#3310)
* support safe load bf16

* fix lint error E501

* add test for loading safetensors

* key should be BOOL

* fix lint
2024-02-04 10:08:39 -05:00
chenyu
d459956966 move TestGetContraction to test_helpers (#3313)
also cleaned long lines in test_shapetracker and enabled the line length check
2024-02-04 06:05:01 -05:00
Felix Wu
021eea3a52 fix UnboundLocalError when running Compiler with DISABLE_COMPILER_CACHE (#3296) 2024-02-01 21:12:33 -05:00
David Hou
3378625773 name upcast variables (#3200)
* name upcast variables

* typing

* unused
2024-01-22 11:37:28 -05:00
George Hotz
d2aab65958 remove unused expr node (#3170)
* remove unused expr node

* still works

* simple expr_idxs

* fixup typing
2024-01-18 14:18:43 -08:00
George Hotz
f0c178b7e9 move get_contraction to helpers (#3162)
* move get_contraction to helpers

* move simplify

* lines

* to_movement_ops is not generic
2024-01-17 19:13:11 -08:00
George Hotz
a464909d79 fast resnet eval (#3135)
* fast resnet eval

* fix HIP multidevice graph

* neater expression for devices

* lines

* add decorator test
2024-01-15 14:15:18 -08:00
Paul Gustafson
6bb65cd02e fix off-by-one error in st_equal (#3131)
* fix off by one error

* whitespace
2024-01-15 11:32:13 -08:00
chenyu
c658aa4fbf minor cleanup of test_disk_tensor (#3112) 2024-01-13 20:54:58 -05:00
chenyu
a300fea2a4 failed test case due to cast resets shapetracker (#3109)
cast implicitly resets the shapetracker and makes it contiguous (for disk tensors), which fails on the Interpreted backend if inputs contain a non-contiguous st.
2024-01-13 12:46:51 -05:00
chenyu
f018a55ea1 update NumNode.__hash__ to be hash(self.b) (#3105)
with this, `a := NumNode(x)` comparing equal to `b` implies `hash(a) == hash(b)`
2024-01-12 19:46:21 -05:00
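This restores Python's contract that objects comparing equal must hash equal, which is what makes NumNode and plain ints interchangeable as dict and set keys. A toy version:

```python
class NumNode:
  def __init__(self, b: int): self.b = b
  def __eq__(self, other): return self.b == other  # equal to plain ints
  def __hash__(self): return hash(self.b)          # so hashes must match too

a = NumNode(4)
assert a == 4 and hash(a) == hash(4)
assert {a: "node"}[4] == "node"  # usable interchangeably as dict keys
```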
chenyu
dab8214103 unit tests for Device.canonicalize (#3055) 2024-01-09 12:47:20 -05:00
George Hotz
655c6f61d3 St real size (#3046)
* track the size in the lazybuffer

* shapetracker real size

* lint
2024-01-08 14:44:53 -08:00