tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-23 22:08:08 -05:00

Author	SHA1	Message	Date
wozeparrot	90f0e2fc49	db in wal mode (#5388)	2024-07-12 20:43:36 -07:00
George Hotz	03c2dc8bd7	lowerer is kernel [run_process_replay] (#5437)	2024-07-12 18:50:55 -07:00
George Hotz	870dc8c350	s/Linearizer/Lowerer [run_process_replay] (#5428)	2024-07-12 15:54:07 -07:00
George Hotz	6707c778d0	scheduleitem is not Tuple [run_process_replay] (#5425) * scheduleitem is not Tuple [run_process_replay] * fix tests * fix op + fuzzers * fix mop test	2024-07-12 15:13:19 -07:00
George Hotz	d13654a820	move uopgraph to file [run_process_replay] (#5364) * move uopgraph to file [run_process_replay] * fix print tree test	2024-07-10 17:34:50 -07:00
chenyu	649641a2f2	fix tqdm with generator without `__len__` (#5238) it should be treated as total = 0 (just show iteration count). also removed duplicated ": " in fetch and fixed unit scale with total = 0	2024-06-30 12:20:59 -04:00
chenyu	fd53b6d901	tqdm supports fractional blocks (#5233) enabled progress bar match in test, it matched perfectly now	2024-06-29 22:30:18 -04:00
chenyu	ae10ae4722	simplify tqdm scale math (#5231) expand the log of log stuff	2024-06-29 21:17:40 -04:00
chenyu	b2ea610df8	fix tqdm unit_scale and support hours in time (#5227) * fix tqdm unit_scale and support hours in time previously it only supports MM:SS. more chars to unitscales, strip trailing "." and " " in formatting, and more tests * simpler	2024-06-29 14:48:51 -04:00
chenyu	42d1f92fc1	simpler tqdm (#5221) can do more, but many cases are not tested	2024-06-29 07:41:46 -04:00
Roelof van Dijk	9704c7d4d4	ruff rule if-exp-instead-of-or-operator (FURB110) (#5178) Co-authored-by: chenyu <chenyu@fastmail.com>	2024-06-27 08:22:19 -07:00
Roelof van Dijk	975b811ad9	names shadowing builtins (#5179) Co-authored-by: chenyu <chenyu@fastmail.com>	2024-06-27 08:15:01 -04:00
chenyu	33211f356b	fix desc in tqdm (#5107) per doc `https://tqdm.github.io/docs/tqdm/`, user does not need to put `: ` in desc, and `: ` is automatically removed after desc if the latter is empty. updated test cases and added a test for set_description	2024-06-22 19:00:38 -04:00
chenyu	e356807696	tinytqdm.set_description and tinytrange (#5101)	2024-06-22 14:45:06 -04:00
chenyu	8080298739	s/tinytqdm/tqdm (#5103) except in unit test where tqdm is imported	2024-06-22 14:18:26 -04:00
nimlgen	fb1bf48cfe	io_uring for copies from disk (#5035) * exp uring * fixes and old version * nv * cleaner * cmp vs aio * fix * no lib * fix nv * linter * disk_speed_test now runs default * fixes * uring -> io_uring * linter happy * get_temp_buf comment added * tiny nits * put wait back * test runs everywhere * remove consts * remove mmap consts * do not require iouring to run test, they are generic	2024-06-21 11:36:51 +03:00
chenyu	4e5add4d01	move test_tqdm to test/unit/ (#5042)	2024-06-18 17:41:39 -04:00
Junjun Dong	c8cd6e725c	Remove BinaryOps.SUB. Replace SUB by ADD and NEG in all tests. Regenerate dataset (#4977) * feat: remove BinaryOps.SUB * remove SUB in test_early_end_local * regenerate dataset. remove SUB in test_linearizer_* * reenable overflow tests * simplify tensor.sub function by returning a+(-b) * remove whitespaces --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-06-18 09:06:13 -04:00
chenyu	acaf9a490d	RECIP(-0.0) should be -inf (#5024) * RECIP(-0.0) should be -inf added test_dtype_alu for PYTHON backend * catcht that * fix those two	2024-06-17 22:26:58 -04:00
George Hotz	1d6f1a15e1	add lt and ge uop methods [run_process_replay] (#4995) * add lt and ge uop methods [run_process_replay] * more correct (should still run process replay)	2024-06-16 09:33:53 -07:00
George Hotz	dac96f177e	ignore indexing in the flopcounter (#4993)	2024-06-16 08:59:55 -07:00
chenyu	50bc14d186	re-enable test that loads torch pkl format (#4986)	2024-06-15 14:11:30 -04:00
wozeparrot	8209cd3c55	easier llama3 + fetch subdir (#4938)	2024-06-14 13:47:27 -07:00
chenyu	5eee974b2a	construct Tensor from python list/tuple directly (#4947) * construct Tensor from python list/tuple directly no numpy. annoying that half memoryview is 3.12 feature... * simpler, and test * flat already * simpler * cute * 10% faster * 5%	2024-06-14 11:36:05 -04:00
Jhenner Tigreros	dc9e9e4363	Convert BinaryOps.DIV to UnaryOps.RECIP and BinaryOps.IDIV (#4887) * Create UnaryOps.RECIP and BinaryOps.IDIV and changing uses of BinaryOps.DIV * Delete unused import * Add cstyle renderer * Fix formatting text * Fix test error due to bad implementation of renderer * Add PTX support * Add RECIP to LLVMIR * Remove BinaryOps.DIV from symbolic test * Change some test and fix C floor division * Change references to DIV for the RECIP or IDIV * Add mimic idiv for symbolic test * Restore floor * Mimic idiv * cast to int * Fix some test and renderer * Remove DIV for render nodes * Resolve issue with div * Add TestRenderer * Fix test * fix error * Fix PAD test * Fix div implementation * Remove DIV * Add upcast to rshift, due to use of MUL and RECIP on DIV * Fix linter * Remove complete BinaryOps.DIV * Fix lint * Fix some test * Revert mul modification * Fix tests * Fix CLANG for uops * Revert IDIV function * Minor fix * modify pattern matching rule to support nan * Fix UNSAFE_PADS_OPS to add UnaryOps.RECIP * Remove const folding for IDIV and fix PTX * Complete remove IDIV from extra * Remove test_div from TestFloatUOps due to test on recip * Fix linearizer * fix * Fix test_22 * Fix llvm * Apply trunc function for llvmlit * use floor instead of trunc * Use correct type * Generate new fuzz db * Fix rshift, do not cast to float to support idiv * Return upcast=false to rshift * Add to unsafepad BinaryOps.IDIV * Remove RECIP override for CUDA * add atol / rtol for the test * Remove cast to int on IDIV * Regenerate sops * delete sops.gz * regenerate * regenerate * regenerate * Reduce margins * pass atol and rtol as parametersg for _test_metrics * regenerated dataset * Regenerate * Remove duplicated * Revert changes on extra * Remove changes extra and NOQA for test * Remove E501 * Remove and change line * Remove E501 * Fix atan2 * Revert import and E501 * Remove E501 * Add hrcp to halp ops * Remove 1 of hrcp * Remove last DIV and add type check on uops for IDIV * Fix new tests * Fix tests and custom function * Regenerate dataset * Regenerate dataset * Revert dataset * Change generate dataset script * Remove line * Change IDIV, type checker validate if x,y and z are int --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-06-14 02:43:46 -07:00
George Hotz	9a3c1e4a17	fix mul div failure (#4928)	2024-06-12 13:58:46 +02:00
George Hotz	11a03cbbf5	don't use uops.add while constructing (#4913) * don't use uops.add while constructing * rebase * bugfixes * have to use BFS * prove it's late * simpler uop symbolic test (why we did this) * use dict, not set	2024-06-12 13:31:34 +02:00
George Hotz	b9afb0d577	test uop as symbolic (#4870) * start work * more tests passing * more tests passing * more * 34 failures * expect the failures * remove broken rule * render is fine in just the test * simplify and put in test	2024-06-09 12:15:11 +02:00
David Hou	cddce0e168	don't cast before view on shape changing bitcast (#4833) * don't cast before view on shape changing bitcast * make sure cast before view triggers	2024-06-04 16:04:52 -04:00
chenyu	1ffa5ec492	unit test ShapeTracker.consecutive (#4800)	2024-06-01 10:10:51 -04:00
chenyu	59c6472b9f	check contiguous in View.create after canonicalizing mask and offset (#4770) mask / offset / strides can change during canonicalization, and contiguous can be True at the end	2024-05-29 11:31:13 -04:00
nimlgen	9b02aef45a	remove rhip (#4579) * remove rhip * remove hip runner	2024-05-14 17:58:19 +03:00
chenyu	5e036cd0b3	test unary and more reduces in test_flopcounter (#4455) cannot really catch a spec change error without testing the new spec explicitly, but we don't intended to change the lazy spec lightly another possible way to catch reduce flopcounter shape would be type checking InterpretedFlopCounter and throw error if `in` results in `Never`	2024-05-06 15:15:16 -04:00
chenyu	d0eb1540d5	helpers.diskcache_clear (#4436) drop all tables in diskcache. added a unit test but disabled it by default because it will drop all cache...	2024-05-05 14:19:01 -04:00
George Hotz	cb7289f9c9	remove clang program header (#4422) * remove clang program header * proper max * bools are numbers * fix compile enet	2024-05-04 08:38:01 -07:00
George Hotz	9fc4465557	subbuffer support (#4397) * subbuffer support * diskbuffer offset * cuda subbuffer works * use subbuffer * more subbuffer tests * consecutive * cast * consec * offset * view is a better name * offset is in nbytes * fix view + memory planner * delete unused DiskRunner * reverse order * no subbuffers on unrealized consts * only enabled for disk * don't reverse memory * view supported devices * pickle buffer view * ring jit * support extra view inputs in jit * fix JIT=2 issue * test copy jit * p2p isn't an option anymore * fix dep tracking issue * fix mypy * fix pickle * from_nv is contents now	2024-05-03 18:05:57 -07:00
George Hotz	2786dff26d	new disk tensor tests (#4393)	2024-05-02 08:54:44 -07:00
George Hotz	bd49d2854a	hotfix: skip fetch tests always	2024-05-01 08:43:26 -07:00
George Hotz	27ee49bf30	tensor variable (#4362) * tensor variable support * consttype without variable? * __setitem__ * symbolic mean works * arange test * more tests * a few more tests	2024-04-30 14:08:57 -07:00
George Hotz	d325be2540	update docs (#4356) * update docs * nn.md * mnist cleanups * rhip test is very slow	2024-04-30 16:51:42 +09:00
Obada Khalili	e4befa41d7	Fix in `_reshape_mask` (#4332) * handle reshape with remainder in _reshape_mask * remove trailing whitespce * use helper_test_op to generate tensors from shapes * test in shapetracket too * remove whitespace * revert property name in other class tests	2024-04-28 11:57:39 -04:00
George Hotz	b6e7243bfa	hotfix: skip slow pre-commit test	2024-04-16 11:48:43 +04:00
chenyu	f6c8032e5d	assert if expr_idxs return might be outside of int32 (#4157)	2024-04-12 14:18:35 -04:00
uuuvn	8a40d7d423	Shape changing bitcast and assert bitcast in disk (#3973) * Shape changing bitcast * only support it on disk * basic test * more tests * RuntimeError instead of assert * create unique temp files * move tests that use disk to test_disk_tensor * linter * remove assert on error messages * that's RuntimeError now --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-03-28 21:49:10 -07:00
chenyu	519336cfea	factor out partial in SumNode div int (#3841) * factor out partial in SumNode div int * div not rem * space	2024-03-20 16:34:33 -04:00
chenyu	455f7bea9b	test example from half resnet that idx has number outside of int32 (#3838) * test example from half resnet that idx has number outside of int32 * ruff	2024-03-20 13:44:20 -04:00
Patrick Tsai	b436c9792f	Fix factoring bug (O(n) arange related) (#3817) * Factoring bug * Another one in case * It works now so change tests back * large arange cumsum optimization * More cleanup * symbolic no factor div test * name change * Rename test --------- Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>	2024-03-19 11:49:42 -04:00
wozeparrot	a0ab755317	threefry again (#3785) * feat: initial xor * feat: initial threefly * feat: remove custom random * fix: really need to install precommit * feat: lmao forgot that this is rotate not a shift * clean: put that there * feat: numpy xor * feat: quick test for xor * feat: llvm xor * feat: slightly working xor in torch * feat: rand works in jit * clean: save a line * feat: match jax * feat: maybe test against jax * feat: requires_grad * fix: fix test_symbolic_ops * feat: lower alpha * feat: just pad * fix: maybe fix training tests? * fix: fix some llvm stuff * feat: cursed realize on the way out * feat: testing jax * fix: why is the jax install process not simple * fix: maybe passing test * fix: symbolic workarounds * clean: still need that precommit * fix: aaaa * fix: more test fixes * fix: quick fix for wgsl * feat: need to set requires_grad on the final tensor * feat: one more tensor * feat: don't take forever * feat: seeing y ci is brok * feat: can't allocate 64GiB lmao * fix: fix this * feat: hope this doesn't break smth before i go to bed * feat: don't destroy ram * feat: int * feat: remove jax * feat: properish workaround? * feat: skip slow webgpu tests * feat: no longer fails * feat: use dtypes * feat: real number * fix: torch * fix: don't test against reference for torch * feat: to device * feat: fix advanced indexing * feat: correct casting * feat: even rng_counter * feat: match master * feat: this was actually bad * fix: maybe? * feat: store * feat: remove realizes * feat: somehow this is important * feat: somehow this is also important * feat: save a line * fix: don't need that anymore * feat: restore this * fix: linter * feat: remove realizes * fix: realized is in base now * fix: add back cast * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: :( * fix: :( * fix: not being dumb * feat: try changing less tests * feat: shouldn't have to change that * feat: contiguous bumps it by one * fix: hmm * fix: numpy memory moment * fix: cl_khr_fp16 * fix: torch has different tensor count * fix: missing contiguous * hmm: hmm * fix: some fixes * fix: typing * feat: dont do that * feat: typing fixes * feat: why is this realize required? * feat: ngl kinda odd typing * feat: oh * feat: remove realizes * feat: why is this realize required? * fix: hacky patch for cudacpu * fix: without this realize pytest crashes????? * fix: shorter line * fix: cudacpu fixes * fix: cudacpu fixes * feat: real buffer * feat: don't search when searching lmao * fix: can't use contiguous things * fix: no more 100GB arrays * fix: revert * fix: skip 7 and 10 * feat: working ish beam * feat: minimize changes * feat: seed 0 stable diffusion example changed * fix: different on ci * fix: no beam * feat: make threefry optional * fix: check value * fix: unused import * feat: threefry default * fix: 5d * feat: allow non upcast div * fix: 5d better * fix: 5d better * fix: save all dtype * feat: proper error * feat: lazyop key * fix: check float * feat: try removing this realize now * feat: disable threefry for uops hip tensor cores * feat: don't need that * feat: only check upcast * fix: disable threefry for some metal tests * feat: disable for metal tensor uops as well * feat: disable for most uops * fix: disable threefry for new uops tests * feat: multitensor * fix: typing * feat: threefry default off * feat: skip threefry half rand * feat: restore old * fix: bad git * clean: ruff * feat: bfloat16 fix * fix: :| * feat: restore old --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-03-18 16:47:07 -04:00
chenyu	639bd5dbfc	move bf16 cast hack to Tensor.llvm_bf16_cast (#3788)	2024-03-17 18:51:22 -04:00
George Hotz	311cf2b7d3	Revert "threefry_2x32 (#2601)" (#3784) This reverts commit `db3de54bc4`.	2024-03-17 10:27:20 -07:00

227 Commits