tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-15 01:48:23 -05:00

Author	SHA1	Message	Date
George Hotz	9a3c1e4a17	fix mul div failure (#4928 )	2024-06-12 13:58:46 +02:00
George Hotz	11a03cbbf5	don't use uops.add while constructing (#4913 ) * don't use uops.add while constructing * rebase * bugfixes * have to use BFS * prove it's late * simpler uop symbolic test (why we did this) * use dict, not set	2024-06-12 13:31:34 +02:00
George Hotz	b9afb0d577	test uop as symbolic (#4870 ) * start work * more tests passing * more tests passing * more * 34 failures * expect the failures * remove broken rule * render is fine in just the test * simplify and put in test	2024-06-09 12:15:11 +02:00
David Hou	cddce0e168	don't cast before view on shape changing bitcast (#4833 ) * don't cast before view on shape changing bitcast * make sure cast before view triggers	2024-06-04 16:04:52 -04:00
chenyu	1ffa5ec492	unit test ShapeTracker.consecutive (#4800 )	2024-06-01 10:10:51 -04:00
chenyu	59c6472b9f	check contiguous in View.create after canonicalizing mask and offset (#4770 ) mask / offset / strides can change during canonicalization, and contiguous can be True at the end	2024-05-29 11:31:13 -04:00
nimlgen	9b02aef45a	remove rhip (#4579 ) * remove rhip * remove hip runner	2024-05-14 17:58:19 +03:00
chenyu	5e036cd0b3	test unary and more reduces in test_flopcounter (#4455 ) cannot really catch a spec change error without testing the new spec explicitly, but we don't intended to change the lazy spec lightly another possible way to catch reduce flopcounter shape would be type checking InterpretedFlopCounter and throw error if `in` results in `Never`	2024-05-06 15:15:16 -04:00
chenyu	d0eb1540d5	helpers.diskcache_clear (#4436 ) drop all tables in diskcache. added a unit test but disabled it by default because it will drop all cache...	2024-05-05 14:19:01 -04:00
George Hotz	cb7289f9c9	remove clang program header (#4422 ) * remove clang program header * proper max * bools are numbers * fix compile enet	2024-05-04 08:38:01 -07:00
George Hotz	9fc4465557	subbuffer support (#4397 ) * subbuffer support * diskbuffer offset * cuda subbuffer works * use subbuffer * more subbuffer tests * consecutive * cast * consec * offset * view is a better name * offset is in nbytes * fix view + memory planner * delete unused DiskRunner * reverse order * no subbuffers on unrealized consts * only enabled for disk * don't reverse memory * view supported devices * pickle buffer view * ring jit * support extra view inputs in jit * fix JIT=2 issue * test copy jit * p2p isn't an option anymore * fix dep tracking issue * fix mypy * fix pickle * from_nv is contents now	2024-05-03 18:05:57 -07:00
George Hotz	2786dff26d	new disk tensor tests (#4393 )	2024-05-02 08:54:44 -07:00
George Hotz	bd49d2854a	hotfix: skip fetch tests always	2024-05-01 08:43:26 -07:00
George Hotz	27ee49bf30	tensor variable (#4362 ) * tensor variable support * consttype without variable? * __setitem__ * symbolic mean works * arange test * more tests * a few more tests	2024-04-30 14:08:57 -07:00
George Hotz	d325be2540	update docs (#4356 ) * update docs * nn.md * mnist cleanups * rhip test is very slow	2024-04-30 16:51:42 +09:00
Obada Khalili	e4befa41d7	Fix in `_reshape_mask` (#4332 ) * handle reshape with remainder in _reshape_mask * remove trailing whitespce * use helper_test_op to generate tensors from shapes * test in shapetracket too * remove whitespace * revert property name in other class tests	2024-04-28 11:57:39 -04:00
George Hotz	b6e7243bfa	hotfix: skip slow pre-commit test	2024-04-16 11:48:43 +04:00
chenyu	f6c8032e5d	assert if expr_idxs return might be outside of int32 (#4157 )	2024-04-12 14:18:35 -04:00
uuuvn	8a40d7d423	Shape changing bitcast and assert bitcast in disk (#3973 ) * Shape changing bitcast * only support it on disk * basic test * more tests * RuntimeError instead of assert * create unique temp files * move tests that use disk to test_disk_tensor * linter * remove assert on error messages * that's RuntimeError now --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-03-28 21:49:10 -07:00
chenyu	519336cfea	factor out partial in SumNode div int (#3841 ) * factor out partial in SumNode div int * div not rem * space	2024-03-20 16:34:33 -04:00
chenyu	455f7bea9b	test example from half resnet that idx has number outside of int32 (#3838 ) * test example from half resnet that idx has number outside of int32 * ruff	2024-03-20 13:44:20 -04:00
Patrick Tsai	b436c9792f	Fix factoring bug (O(n) arange related) (#3817 ) * Factoring bug * Another one in case * It works now so change tests back * large arange cumsum optimization * More cleanup * symbolic no factor div test * name change * Rename test --------- Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>	2024-03-19 11:49:42 -04:00
wozeparrot	a0ab755317	threefry again (#3785 ) * feat: initial xor * feat: initial threefly * feat: remove custom random * fix: really need to install precommit * feat: lmao forgot that this is rotate not a shift * clean: put that there * feat: numpy xor * feat: quick test for xor * feat: llvm xor * feat: slightly working xor in torch * feat: rand works in jit * clean: save a line * feat: match jax * feat: maybe test against jax * feat: requires_grad * fix: fix test_symbolic_ops * feat: lower alpha * feat: just pad * fix: maybe fix training tests? * fix: fix some llvm stuff * feat: cursed realize on the way out * feat: testing jax * fix: why is the jax install process not simple * fix: maybe passing test * fix: symbolic workarounds * clean: still need that precommit * fix: aaaa * fix: more test fixes * fix: quick fix for wgsl * feat: need to set requires_grad on the final tensor * feat: one more tensor * feat: don't take forever * feat: seeing y ci is brok * feat: can't allocate 64GiB lmao * fix: fix this * feat: hope this doesn't break smth before i go to bed * feat: don't destroy ram * feat: int * feat: remove jax * feat: properish workaround? * feat: skip slow webgpu tests * feat: no longer fails * feat: use dtypes * feat: real number * fix: torch * fix: don't test against reference for torch * feat: to device * feat: fix advanced indexing * feat: correct casting * feat: even rng_counter * feat: match master * feat: this was actually bad * fix: maybe? * feat: store * feat: remove realizes * feat: somehow this is important * feat: somehow this is also important * feat: save a line * fix: don't need that anymore * feat: restore this * fix: linter * feat: remove realizes * fix: realized is in base now * fix: add back cast * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: :( * fix: :( * fix: not being dumb * feat: try changing less tests * feat: shouldn't have to change that * feat: contiguous bumps it by one * fix: hmm * fix: numpy memory moment * fix: cl_khr_fp16 * fix: torch has different tensor count * fix: missing contiguous * hmm: hmm * fix: some fixes * fix: typing * feat: dont do that * feat: typing fixes * feat: why is this realize required? * feat: ngl kinda odd typing * feat: oh * feat: remove realizes * feat: why is this realize required? * fix: hacky patch for cudacpu * fix: without this realize pytest crashes????? * fix: shorter line * fix: cudacpu fixes * fix: cudacpu fixes * feat: real buffer * feat: don't search when searching lmao * fix: can't use contiguous things * fix: no more 100GB arrays * fix: revert * fix: skip 7 and 10 * feat: working ish beam * feat: minimize changes * feat: seed 0 stable diffusion example changed * fix: different on ci * fix: no beam * feat: make threefry optional * fix: check value * fix: unused import * feat: threefry default * fix: 5d * feat: allow non upcast div * fix: 5d better * fix: 5d better * fix: save all dtype * feat: proper error * feat: lazyop key * fix: check float * feat: try removing this realize now * feat: disable threefry for uops hip tensor cores * feat: don't need that * feat: only check upcast * fix: disable threefry for some metal tests * feat: disable for metal tensor uops as well * feat: disable for most uops * fix: disable threefry for new uops tests * feat: multitensor * fix: typing * feat: threefry default off * feat: skip threefry half rand * feat: restore old * fix: bad git * clean: ruff * feat: bfloat16 fix * fix: :\| * feat: restore old --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-03-18 16:47:07 -04:00
chenyu	639bd5dbfc	move bf16 cast hack to Tensor.llvm_bf16_cast (#3788 )	2024-03-17 18:51:22 -04:00
George Hotz	311cf2b7d3	Revert "threefry_2x32 (#2601 )" (#3784 ) This reverts commit `db3de54bc4`.	2024-03-17 10:27:20 -07:00
wozeparrot	db3de54bc4	threefry_2x32 (#2601 ) * feat: initial xor * feat: initial threefly * feat: remove custom random * fix: really need to install precommit * feat: lmao forgot that this is rotate not a shift * clean: put that there * feat: numpy xor * feat: quick test for xor * feat: llvm xor * feat: slightly working xor in torch * feat: rand works in jit * clean: save a line * feat: match jax * feat: maybe test against jax * feat: requires_grad * fix: fix test_symbolic_ops * feat: lower alpha * feat: just pad * fix: maybe fix training tests? * fix: fix some llvm stuff * feat: cursed realize on the way out * feat: testing jax * fix: why is the jax install process not simple * fix: maybe passing test * fix: symbolic workarounds * clean: still need that precommit * fix: aaaa * fix: more test fixes * fix: quick fix for wgsl * feat: need to set requires_grad on the final tensor * feat: one more tensor * feat: don't take forever * feat: seeing y ci is brok * feat: can't allocate 64GiB lmao * fix: fix this * feat: hope this doesn't break smth before i go to bed * feat: don't destroy ram * feat: int * feat: remove jax * feat: properish workaround? * feat: skip slow webgpu tests * feat: no longer fails * feat: use dtypes * feat: real number * fix: torch * fix: don't test against reference for torch * feat: to device * feat: fix advanced indexing * feat: correct casting * feat: even rng_counter * feat: match master * feat: this was actually bad * fix: maybe? * feat: store * feat: remove realizes * feat: somehow this is important * feat: somehow this is also important * feat: save a line * fix: don't need that anymore * feat: restore this * fix: linter * feat: remove realizes * fix: realized is in base now * fix: add back cast * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: :( * fix: :( * fix: not being dumb * feat: try changing less tests * feat: shouldn't have to change that * feat: contiguous bumps it by one * fix: hmm * fix: numpy memory moment * fix: cl_khr_fp16 * fix: torch has different tensor count * fix: missing contiguous * hmm: hmm * fix: some fixes * fix: typing * feat: dont do that * feat: typing fixes * feat: why is this realize required? * feat: ngl kinda odd typing * feat: oh * feat: remove realizes * feat: why is this realize required? * fix: hacky patch for cudacpu * fix: without this realize pytest crashes????? * fix: shorter line * fix: cudacpu fixes * fix: cudacpu fixes * feat: real buffer * feat: don't search when searching lmao * fix: can't use contiguous things * fix: no more 100GB arrays * fix: revert * fix: skip 7 and 10 * feat: working ish beam * feat: minimize changes * feat: seed 0 stable diffusion example changed * fix: different on ci * fix: no beam * feat: make threefry optional * fix: check value * fix: unused import * feat: threefry default * fix: 5d * feat: allow non upcast div * fix: 5d better * fix: 5d better * fix: save all dtype * feat: proper error * feat: lazyop key * fix: check float * feat: try removing this realize now * feat: disable threefry for uops hip tensor cores * feat: don't need that * feat: only check upcast * fix: disable threefry for some metal tests * feat: disable for metal tensor uops as well * feat: disable for most uops * fix: disable threefry for new uops tests * feat: multitensor * fix: typing * feat: threefry default off * feat: skip threefry half rand * feat: restore old * fix: bad git * clean: ruff * feat: bfloat16 fix * fix: :\| --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-03-17 10:19:33 -07:00
George Hotz	53adcb34f5	remove hip backend (#3783 ) * remove hip backend * remove unused * rhip * more RHIP	2024-03-17 10:12:16 -07:00
chenyu	a2d3cf64a5	move is_dtype_supported to test.helpers (#3762 ) * move is_dtype_supported to test.helpers updated all places that check if float16 is supports * fix tests	2024-03-15 14:33:26 -04:00
George Hotz	69ca7f7bf9	changes for teenygrad (#3665 ) * changes for teenygrad * upd * simpler test	2024-03-09 15:30:34 -08:00
chenyu	8f10bfa2ff	ban __bool__ on Tensor (#3632 ) * ban __bool__ on Tensor avoid misuse * test case * fix tests * fix more tests	2024-03-06 17:12:35 -05:00
chenyu	968d109453	apply more create_lt_node (#3597 ) updated one in linearizer if condition, and various symbolic tests	2024-03-03 16:12:39 -05:00
Patrick Tsai	0082300a59	Fix symbolic negative floordiv (#3594 ) Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>	2024-03-03 11:40:52 -05:00
chenyu	e09619ab6c	explicitly create_lt_node when used in shapetracker _expr_view (#3561 ) * explicitly create_lt_node when used in shapetracker leave regular __lt__ and cmps for symbolic shape cmp * hmm it fixed that? * LtNode.substitute uses create_lt_node	2024-03-03 10:08:21 -05:00
Marcin Słowik	f90caa4b92	Escape table name in diskcache queries. (#3543 ) Some devices create cache table names with non-alphanumerical characters, e.g. "compile_hip_gfx1010:xnack-_12". This commit escapes the table name in single quotes s.t. sqlite works (see https://github.com/tinygrad/tinygrad/issues/3538).	2024-02-29 13:04:21 -08:00
George Hotz	48918fa75a	fix disktensor offset issue (#3532 )	2024-02-28 17:22:17 -08:00
chenyu	0c6846f9fc	failed test case for disk tensor assign into dtype int64 (#3527 ) failed case for #3510, mark as expectedFailure for now	2024-02-28 17:52:21 -05:00
chenyu	88939c3347	fix Node.max can be symbolic (#3514 ) Also made sure taking max twice can get int.	2024-02-27 17:21:31 -05:00
chenyu	61605ccc69	Remove special case of SumNode div SumNode (#3502 )	2024-02-26 09:42:06 -05:00
George Hotz	871ba73e65	_reduce_op is axis based now (#3462 ) * _reduce_op is axis based now * axis_ * update lin failures * disable that * fix shape	2024-02-21 16:36:31 +01:00
chenyu	0d326a48b8	fix LtNode simplification when lhs and rhs contain same variables (#3451 ) * fix LtNode simplification when lhs and rhs contain same variables `(Variable("a", 1, 5) < Variable("a", 1, 5))` should eval to `NumNode(0)` * fix with less perf impact	2024-02-20 09:06:55 -05:00
chenyu	2da734920e	use __getnewargs__ to fix unpickling Variable (#3441 ) it's recommended to use __getnewargs__ to update the args of classes that use __new__ when unpickling. It's preferred because it does not change the __new__ behavior.	2024-02-18 10:28:37 -05:00
xarkes	28a8b72024	Remove Interpreted device & remaining CPU/TORCH ref (#3423 ) * Remove Interpreted device & remaining CPU/TORCH ref * Oops * supports_device was useful * Fix doc wording --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-02-16 00:30:21 -05:00
George Hotz	93eceef727	remove cpu prereqs (#3410 )	2024-02-15 13:45:06 +01:00
Jyotirmaya Mahanta	b6a2600c86	fix merging condition in merge_dims (#3363 ) * fix merging condition in merge_dims * add tests * set contiguous after mask is canonicalized * minor fix	2024-02-12 11:50:26 +01:00
chenyu	97275101e9	fix safetensor load uint32 and uint64 (#3315 ) the correct keys are U32 and U64.	2024-02-04 10:46:27 -05:00
Yoshinori Sano	edb74897b2	support safe load bf16 (#3310 ) * support safe load bf16 * fix lint error E501 * add test for loading safetensors * key should be BOOL * fix lint	2024-02-04 10:08:39 -05:00
chenyu	d459956966	move TestGetContraction to test_helpers (#3313 ) also cleaned long lines in test_shapetracker and enabled the line length check	2024-02-04 06:05:01 -05:00
Felix Wu	021eea3a52	fix UnboundLocalError when running Compiler with DISABLE_COMPILER_CACHE (#3296 )	2024-02-01 21:12:33 -05:00
David Hou	3378625773	name upcast variables (#3200 ) * name upcast variables * typing * unused	2024-01-22 11:37:28 -05:00
George Hotz	d2aab65958	remove unused expr node (#3170 ) * remove unused expr node * still works * simple expr_idxs * fixup typing	2024-01-18 14:18:43 -08:00

... 14 15 16 17 18 ...

952 Commits