tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-22 13:28:06 -05:00

Author	SHA1	Message	Date
geohotstan	f8056a74d6	combine pad2d with pad (#7677 ) * I have pad2d, I have pad, uuh~, pad2dpad~ * fix some small things * strategically placed cast hack * fix more * fix more more * tests * periods	2024-11-14 17:56:02 +08:00
qazal	e84d089ef1	delete ReduceOps, only use REDUCE_AXIS (#7667 )	2024-11-13 19:04:27 +08:00
qazal	e07d2d0966	skip TestBeamSearch.test_large_ast (#7652 )	2024-11-12 20:52:22 +08:00
chenyu	035e39f900	remove copied is_dtype_supported from onnx [pr] (#7646 )	2024-11-11 19:20:32 -05:00
Ahmed Harmouche	9c63c3d8ab	These casts should only happen if these are supported (#7644 )	2024-11-12 07:56:50 +08:00
nimlgen	4d81b7952a	qcom match texture/sampler descriptors to OpenCL (#7622 ) * qcom ioctl compare more regs * bug fix	2024-11-11 21:56:51 +03:00
uuuvn	94a484542b	Hook memoryview via class instead of a function (#7627 )	2024-11-11 09:07:06 +08:00
chenyu	e7b18cf5c0	fix load_worlds filter_novariable (#7564 ) filter based on "DEFINE_VAR" instead of "Variable". also added a unit test to make sure dataset includes image and variable kernels	2024-11-05 16:06:39 -05:00
chenyu	207bca6cea	set PAGE_SIZE=1 and generate new dataset (#7559 ) 13080 rows in total. both generating and loading this are pretty broken now. filters are wrong for example	2024-11-05 11:25:01 -05:00
chenyu	7581a57aac	show the actual dataset size in error message (#7557 )	2024-11-05 09:16:30 -05:00
chenyu	0db5f52b2a	check `datasets/sops.gz` size to be > 5000 (#7555 ) it has > 12000 rows now, but it depends on the backend that generates these so setting a lower but meaningful threshold	2024-11-05 09:03:19 -05:00
chenyu	e641bbc859	safe softmax trick in MCTS ucb_explored_children (#7515 ) * safe softmax trick in MCTS ucb_explored_children fixed ``` File "numpy/random/mtrand.pyx", line 971, in numpy.random.mtrand.RandomState.choice ValueError: probabilities contain NaN ``` when all ucb_explored_children are big negative numbers result in all NaN probabilities * better type	2024-11-03 15:59:31 -05:00
George Hotz	c8bf09b7d4	s/UOps/Ops (#7500 ) * s/UOps/Ops [pr] * fix	2024-11-03 11:26:10 +08:00
chenyu	fb694a63eb	Tensor.erf (#7419 ) the same one used in onnx and the one in bert.	2024-10-30 18:12:28 -04:00
eliotgolding	e920f1d663	Llama 3.2 1B load from GGUF (#7295 ) * gguf 1b-instruct * not needed	2024-10-27 09:29:02 +08:00
nimlgen	68cd2c0669	nv correct local memory based on device (#7307 ) * nv correct local memory based on device * linter * oops * oops2	2024-10-25 22:23:42 +03:00
nimlgen	ea11382087	nv fix shared_memory_size (#7239 )	2024-10-23 21:59:47 +03:00
qazal	aeeb917b6e	mask out writable bufs in runtime access_resources (#7234 )	2024-10-23 16:13:50 +03:00
George Hotz	b0a13896d7	PtrDType is dataclass [pr] (#7125 ) * PtrDType is dataclass [pr] * new dataset --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-10-18 09:40:33 -04:00
nimlgen	45db7d9045	fuzz qcom vs opencl (#7130 ) * fuzz qcom vs opencl * fix nv * bettre? * typo * open both devs	2024-10-17 18:49:08 +03:00
George Hotz	3169cb386d	remove graph [pr] (#7085 )	2024-10-16 11:40:07 +08:00
nimlgen	b025495e5c	fuzz nv vs cuda (#7066 ) * fuzz nv vs cuda * fixes * smth * um * cmp the same * dnrt * correct gpfifo scan * fix	2024-10-15 22:22:40 +03:00
qazal	8ff6514ba3	delete extra/ops.py [pr] (#7072 )	2024-10-15 22:14:21 +03:00
nimlgen	586ff4c910	nv record uvm mappings (#7059 ) * nv record uvm mappings * linteeer * smth * ooops	2024-10-15 00:12:49 +03:00
nimlgen	8094340221	nv print info about faults (#7057 ) * nv print info about faults * unrelated changes * nv_gpu.GT200_DEBUGGER in mockgpu * regen with ocrrect version * spacing	2024-10-14 21:49:38 +03:00
chenyu	bd8ecf7fd6	remove NumNode (#7035 )	2024-10-13 16:42:19 -04:00
chenyu	c4c806a210	generate new kernel dataset (#7034 ) * generate new kernel dataset pre req to remove NumNode ``` extra/optimization/generate_dataset.sh gzip -k /tmp/sops mv /tmp/sops.gz extra/datasets/ ``` * fix var range in fuzz_linearizer	2024-10-13 16:19:41 -04:00
qazal	13846930cd	hotfix: extract_dataset.py (#7029 )	2024-10-13 11:18:23 +03:00
George Hotz	a71bb09ec3	remove symbolic file [pr] (#7012 )	2024-10-12 18:44:44 +08:00
George Hotz	5ae2de9845	UOp.variable (#7010 ) * UOp.variable [pr] * fix tests * clean * improve name rendering * last bug	2024-10-12 18:20:44 +08:00
qazal	20d3c2d113	unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW (#6955 ) * add UOps.VIEW * update hardcoded asts * update sops.gz	2024-10-09 02:00:17 +08:00
Tobias Fischer	f9e32f2bb2	clip device fix (#6924 )	2024-10-07 00:47:32 +08:00
chenyu	01a2d7316d	dtype=float in bert log_softmax for loss and accuracy (#6916 )	2024-10-06 11:15:56 -04:00
George Hotz	4df5c7a4ef	move lazy to engine [pr] (#6886 ) * move lazy to engine [pr] * engine.lazy	2024-10-04 23:19:26 +08:00
George Hotz	8ca506ee37	remove the magic methods for moving between devices [pr] (#6881 ) * remove the magic methods for moving between devices [pr] * remove unneeded clang	2024-10-04 20:27:52 +08:00
chenyu	7c8849010a	fix var_vals in MCTS (#6882 ) tested with JITBEAM=100 llama	2024-10-04 08:19:35 -04:00
George Hotz	a0cb16ac61	node cleanup + local metal test speed [pr] (#6880 ) * node cleanup [pr] * fix tests, including the double one on metal * no time tqdm tests	2024-10-04 18:14:23 +08:00
George Hotz	cdff1d75b6	things that are only used in one place don't belong in helpers [pr] (#6878 ) * things that are only used in one place don't belong in helpers [pr] * pretty print moved	2024-10-04 17:27:38 +08:00
George Hotz	f4ec39fe58	switch symbolic from old to uops, final PR (#6872 ) * switch symbolic from old to uops, final PR * two wrong answers * not needed resolves * symbolic ops passes * symbolic ops passes * progress * tests pass (almost) * fix last test * fix some tests * global binding and unbinding * Revert "global binding and unbinding" This reverts commit `9456725630`. * that test works now * vars on uop doesn't recurse * fix fuzzer * update * fix type * fix gpt, it's UOp now * ssimplify symbolics	2024-10-04 16:42:27 +08:00
chenyu	c3c93f332a	symbolic bool raise ValueError when not sure [pr] (#6853 )	2024-10-02 09:10:58 -04:00
Tobias Fischer	33f7599158	Compute FID Score (#6802 ) * compute fid score code * cleaner s1 and m1 loading	2024-10-01 19:47:58 -04:00
Francis Lata	d3a387be63	[MLPerf] Prepare openimages dataset script (#6747 ) * prepare openimages for MLPerf * cleanup * fix issue when clearing jit_cache on retinanet eval * revert pandas specific changes	2024-09-27 11:13:56 -04:00
nimlgen	3c56aeee70	add Tensor.from_blob (#6765 ) * draft tensor from pointer init * some docs and types * comment * cleaner * test * malloc * qcom cl interop * jit example * cleaner * dealoc * wording * docs	2024-09-26 18:33:19 +08:00
chenyu	396c96357b	update mlperf bert scripts (#6755 ) removed DISABLE_DROPOUT=1. updated BS to 54 that works on tinyboxes with dropouts. used bert's sparse_categorical_crossentropy that takes Tensor ignore_index in accuracy method	2024-09-25 23:55:05 -04:00
wozeparrot	4ebc9589a6	feat: make buffer (#6745 )	2024-09-25 18:31:03 +08:00
nimlgen	56979aa3ed	qcom ioctl log levels (#6735 )	2024-09-25 14:59:27 +08:00
wozeparrot	2be0b26a1f	rand only supports single device (#6682 )	2024-09-24 16:07:44 +08:00
nimlgen	ca66b11e07	qcom fix disasm (#6703 )	2024-09-24 15:23:43 +08:00
samm393	19c11792fd	Flux.1 (#6334 ) * initial commit * whitespace * get rid of torch import * indentation * less hardcoding * add flux.1-dev * jit * no double * t5 tidy up * validation image * reuse sdxl autoencoder * typing changes * empty lines * remove unneeded comments --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-09-24 10:08:04 +08:00
chenyu	31b9c74c77	tiny import cleanup and fix typo (#6692 )	2024-09-23 21:48:23 -04:00
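
Note on e641bbc859 (safe softmax trick in MCTS ucb_explored_children): the commit message reports NaN probabilities when every input is a large negative number. The snippet below is a minimal sketch of the general technique, not the code from that commit; the function names are hypothetical and it uses only the Python standard library. Subtracting the maximum before exponentiating keeps at least one term equal to exp(0) = 1, so the normalizing sum cannot underflow to zero.

```python
import math

def naive_softmax(xs):
    # exp() of a large negative number underflows to 0.0, so the sum can be 0.0.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    # With plain floats, 0.0 / 0.0 raises ZeroDivisionError;
    # with NumPy arrays the same division yields NaN instead.
    return [e / total for e in exps]

def safe_softmax(xs):
    # Shift by the maximum: the largest entry becomes exp(0) = 1,
    # so the sum is at least 1 and the division is always well defined.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    scores = [-1000.0, -1001.0, -1002.0]
    print(safe_softmax(scores))
    try:
        naive_softmax(scores)
    except ZeroDivisionError:
        print("naive softmax failed: all exponentials underflowed to 0")
```

The shift does not change the result mathematically, because the factor exp(-m) cancels between numerator and denominator.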

1242 Commits