Commit Graph

121 Commits

Author SHA1 Message Date
George Hotz
8e8fec408e fix n^2 _apply_map_to_tensors [pr] (#13443)
* clean up slow rules

* fix rule

* non n^2 toposort

* topovisit

* state dict profile_marker
2025-11-24 18:59:16 -08:00
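A minimal sketch of the non-quadratic topological visit the commit above refers to, assuming a generic graph of nodes with a `.src` tuple; the names here are illustrative stand-ins, not tinygrad's actual UOp internals.

```python
# Illustrative, linear-time iterative toposort over a DAG (O(V + E)):
# each node is pushed twice (pre/post) and visited once, avoiding the
# repeated scans that make a naive implementation O(n^2).
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
  name: str
  src: tuple["Node", ...] = ()

def toposort(roots: list[Node]) -> list[Node]:
  order: list[Node] = []
  visited: set[int] = set()
  stack: list[tuple[Node, bool]] = [(r, False) for r in roots]
  while stack:
    node, children_done = stack.pop()
    if children_done:
      order.append(node)              # all dependencies already emitted
      continue
    if id(node) in visited: continue
    visited.add(id(node))
    stack.append((node, True))        # revisit after children
    for s in node.src: stack.append((s, False))
  return order

if __name__ == "__main__":
  a = Node("a"); b = Node("b", (a,)); c = Node("c", (a, b))
  print([n.name for n in toposort([c])])   # ['a', 'b', 'c']
```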
George Hotz
cc5e6323ac stable diffusion profiling (#13441)
* stable diffusion profiling

Signed-off-by: George Hotz <geohot@gmail.com>

* profile_marker

* profile per step

* fix slow Context

* profile that

---------

Signed-off-by: George Hotz <geohot@gmail.com>
2025-11-24 15:25:45 -08:00
Sieds Lykles
1e93d19ee3 stable diffusion --fakeweights (#12810) 2025-10-20 12:41:06 +02:00
George Hotz
af4479c169 faster stable diffusion load (#12725)
* faster stable diffusion load

* failing tests
2025-10-16 18:31:59 +08:00
George Hotz
6e6059dde0 clean up stable diffusion weight loading (#12452) 2025-10-09 11:13:11 +08:00
hooved
0f804c9a83 Stable Diffusion model init for mlperf (#12314)
* include clip pr diff

* updated unet and sd init

* dehardcode default device

* revert beam hang workaround

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-10-02 02:28:41 -04:00
Sieds Lykles
5b73076e48 assert benchmark times (#12042)
* assert jitted times in openpilot

* better error

* better error

* add ASSERT_MIN_STEP_TIME to more models

* t is step_times

* update benchmark times

* update times
2025-09-09 23:40:02 +02:00
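A hedged sketch of the step-time assertion idea above: time each step and fail the benchmark if the fastest one exceeds a budget read from an `ASSERT_MIN_STEP_TIME`-style environment variable (the exact semantics of tinygrad's variable are assumed here, not taken from the source).

```python
# Illustrative only: assert the fastest step stays under a time budget.
# The real ASSERT_MIN_STEP_TIME handling in tinygrad's benchmarks may differ.
import os, time

def run_and_check(step_fn, steps: int = 10) -> list[float]:
  step_times = []
  for i in range(steps):
    st = time.perf_counter()
    step_fn(i)                                   # one benchmark step
    step_times.append(time.perf_counter() - st)
  budget = os.getenv("ASSERT_MIN_STEP_TIME")
  if budget is not None:
    best = min(step_times)
    assert best < float(budget), f"fastest step took {best:.4f}s, budget {budget}s"
  return step_times
```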
George Hotz
b3b43a82c4 remove Tensor.no_grad, it's meaningless now [pr] (#10556) 2025-05-28 22:20:02 -07:00
wozeparrot
1ed04f993b move benchmark stat tracking to influxdb (#10185) 2025-05-15 16:14:56 -07:00
Ahmed Harmouche
8909dbd82c Remove wgpu specific checks from stable diffusion example (#7991) 2024-12-02 11:31:14 +01:00
chenyu
69e382216d fix wino conv output dtype for half inputs (#7829) 2024-11-21 12:13:54 -05:00
George Hotz
dc3148c677 hotfix: minor speed increase + stable diffusion relax 2024-10-25 16:27:21 +08:00
George Hotz
7e73c7b3cc hotfix: bump stable diffusion val distance 2024-09-26 11:15:29 +08:00
wozeparrot
c100f3d406 default threefry (#6116) 2024-09-25 17:45:13 +08:00
chenyu
c9a9631818 no UnaryOps.NEG in generated UOp patterns (#6209)
* no UnaryOps.NEG in generated UOp patterns

removed the patterns `x * (-1) -> -x` and `x != True`

* those are fine because NEG became CMPNE and True

* fix sd validation L2 norm
2024-08-21 11:08:22 -04:00
Tobias Fischer
8c9c1cf62f Pulled CLIP and UNet into Separate Files (#5253)
* pulled clip and unet into separate files

* reference cleanup, lru cache fix

* better pool indexing
2024-07-01 22:33:01 -04:00
chenyu
b9122ecdaf revert stable diffusion validation with threefry (#5248)
* Revert "use threefry in stable diffusion benchmark (#4988)"

This reverts commit 44dfa37c70.

* sdxl and validation fix

* relax threshold
2024-07-01 14:43:47 -04:00
chenyu
88763eb9ff fix stable_diffusion with fp16 (#5239) 2024-06-30 12:59:31 -04:00
Tobias Fischer
4688f97d48 Add SDXL Inference to Examples (#5206)
* added sdxl inference code

* fixed trailing whitespace

* use original impl code, removed unneeded numpy calls
2024-06-28 07:42:28 -04:00
chenyu
0ba093dea0 hotfix: only validate stable diffusion when using threefry (#5166) 2024-06-26 16:50:38 -04:00
chenyu
e4a5870b36 validate stable_diffusion output (#5163)
changed default steps, forgot to update validation
2024-06-26 16:42:21 -04:00
chenyu
e356807696 tinytqdm.set_description and tinytrange (#5101) 2024-06-22 14:45:06 -04:00
chenyu
44dfa37c70 use threefry in stable diffusion benchmark (#4988)
also updated default steps to 10. easier to tell the image is following the prompt.
2024-06-15 20:25:29 -04:00
chenyu
fd249422f5 minor cleanup example stable_diffusion (#4753) 2024-05-28 00:05:37 -04:00
chenyu
30fc1ad415 remove TODO: remove explicit dtypes after broadcast fix in stable_diffusion (#4241)
this is done
2024-04-21 00:31:24 -04:00
chenyu
c71627fee6 move GlobalCounter to helpers (#4002)
break circular import between ops and buffer
2024-03-30 00:30:30 -04:00
George Hotz
150ea2eb76 create engine folder and move code (#3948)
* retry

* older tf

* that
2024-03-26 20:38:03 -07:00
George Hotz
3527c5a9d2 add Tensor.replace (#3738)
* add Tensor.replace

* fix dtypes in that test

* should be replace

* and mixtral
2024-03-14 13:34:14 -07:00
rnxyfvls
490c5a3ec3 examples/stable_diffusion: support model checkpoints without alphas_cumprod key (#3681)
* examples/stable_diffusion: support model checkpoints without alphas_cumprod key

(which is most models on civitai)

* fix indent

---------

Co-authored-by: a <a@a.aa>
2024-03-11 16:05:52 -04:00
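The fallback described above can be sketched like this: when a checkpoint has no `alphas_cumprod` key, rebuild it from the usual Stable Diffusion beta schedule. The schedule endpoints below are the common SD defaults, assumed here rather than read from the commit.

```python
# Hypothetical sketch: recompute alphas_cumprod from a scaled-linear DDPM
# beta schedule when the checkpoint doesn't ship the key.
import numpy as np

def get_alphas_cumprod(state_dict, n_steps=1000, beta_start=0.00085, beta_end=0.012):
  if "alphas_cumprod" in state_dict:
    return state_dict["alphas_cumprod"]
  betas = np.linspace(beta_start**0.5, beta_end**0.5, n_steps, dtype=np.float64) ** 2
  return np.cumprod(1.0 - betas, axis=0)
```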
George Hotz
41efaa848c move graph.py and jit.py into features (#3376)
* move graph.py into features

* move jit into features

* fix quickstart
2024-02-12 17:34:34 +01:00
George Hotz
a280cfe169 move dtypes to dtype.py (#2964)
* move dtypes to dtype.py

* fix urllib
2024-01-01 14:58:48 -08:00
George Hotz
c81ce9643d move globalcounters to ops (#2960)
* move globalcounters to ops

* missed a few

* sick of that failing
2024-01-01 14:21:02 -08:00
chenyu
7dc3352877 increase stable diffusion validation threshold 1e-4 -> 3e-4 (#2897)
saw a flaky CI failure with 1.1e-4, and 3e-4 is a good number
2023-12-21 11:45:25 -05:00
chenyu
a044125c39 validate stable diffusion for seed 0 (#2773)
* validate stable diffusion for seed 0

the closest false positive I can get is the same setup with one fewer step: dist = 0.0036.
the same setup with fp16 has dist = 5e-6,
so setting the validation threshold to 1e-4 should be good.

* run with --seed 0
2023-12-15 00:07:09 -05:00
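What the thresholds in the two commits above measure can be sketched as a distance check between the generated image and a stored reference for a fixed seed; the metric and normalization below are assumptions for illustration, not the example's exact code.

```python
# Illustrative validation: compare output to a known-good reference and
# assert the (normalized, assumed) distance stays under the threshold.
import numpy as np

def validate(generated: np.ndarray, reference: np.ndarray, threshold: float = 3e-4) -> float:
  diff = generated.astype(np.float64) - reference.astype(np.float64)
  dist = np.linalg.norm(diff.ravel()) / generated.size   # normalization is an assumption
  assert dist < threshold, f"validation failed: dist={dist:.2e} >= {threshold:.0e}"
  return dist
```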
chenyu
9afa8009c1 hot fix explicitly set arange dtype to float (#2772) 2023-12-14 23:14:38 -05:00
George Hotz
9e07824542 move device to device.py (#2466)
* move device to device.py

* pylint test --disable R,C,W,E --enable E0611

* fix tests
2023-11-27 11:34:37 -08:00
George Hotz
095e2ced61 add name support to fetch (#2407)
* add name support

* use fetch in gpt2

* remove requests from main lib, networkx also optional

* umm, keep that assert

* updates to fetch

* i love the walrus so much

* stop bundling mnist with tinygrad

* err, https

* download cache names

* add DOWNLOAD_CACHE_VERSION

* need env.

* ugh, wrong path

* replace get_child
2023-11-23 14:16:17 -08:00
chenyu
5ef8d682e3 clean up attentions in stable diffusion (#2275) 2023-11-11 14:25:36 -05:00
Ahmed Harmouche
265304e7fd Stable diffusion WebGPU port (#1370)
* WIP: Stable diffusion WebGPU port

* Load whole model: split safetensor to avoid Chrome allocation limit

* Gitignore .DS_Store, remove debug print

* Clip tokenizer in JS

* WIP: Compile model in parts (text model, diffusor, get_x_prev_and_pred_x0, decoder), and recreate forward logic in JS

* e2e stable diffusion flow

* Create initial random latent tensor in JS

* SD working e2e

* Log if some weights were not loaded properly

* Remove latent_tensor.npy used for debugging

* Cleanup, remove useless logs

* Improve UI

* Add progress bar

* Remove .npy files used for debugging

* Add clip tokenizer as external dependency

* Remove alphas_cumprod.js and load it from safetensors

* Refactor

* Simplify a lot

* Dedup base when limiting elementwise merge (webgpu)

* Add return type to safe_load_metadata

* Do not allow run when webgpu is not supported

* Add progress bar, refactor, fix special names

* Add option to choose from local vs huggingface weights

* lowercase tinygrad :)

* fp16 model dl, decompression client side

* Cache f16 model in browser, better progress

* Cache miss recovery

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-03 18:29:16 -07:00
George Hotz
6dc8eb5bfd universal disk cache (#2130)
* caching infra for tinygrad

* non-str key

* fix linter

* no shelve in beam search

* beam search caching

* check tensor cores with beam too

* pretty print

* LATEBEAM in stable diffusion
2023-10-22 10:56:57 -07:00
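A rough sketch of the disk-caching idea in the commit above: memoize expensive results (e.g. beam-search outcomes) in a small sqlite database keyed by a string, instead of `shelve`. This is a generic illustration, not tinygrad's actual diskcache API.

```python
# Generic on-disk memoization sketch; tinygrad's real cache differs in detail.
import functools, os, pickle, sqlite3
from pathlib import Path

_DB = Path(os.getenv("CACHE_DIR", Path.home() / ".cache" / "mycache")) / "cache.db"

def _conn():
  _DB.parent.mkdir(parents=True, exist_ok=True)
  conn = sqlite3.connect(_DB)
  conn.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, val BLOB)")
  return conn

def diskcache(fn):
  @functools.wraps(fn)
  def wrapper(key: str, *args, **kwargs):        # first argument doubles as the cache key
    conn = _conn()
    row = conn.execute("SELECT val FROM cache WHERE key=?", (key,)).fetchone()
    if row is not None: return pickle.loads(row[0])
    val = fn(key, *args, **kwargs)
    conn.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, pickle.dumps(val)))
    conn.commit()
    return val
  return wrapper
```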
Ahmed Harmouche
0d3410d93f Stable diffusion: Make guidance modifiable (#2077) 2023-10-15 14:36:43 -07:00
Ahmed Harmouche
e27fedfc7b Fix stable diffusion output error on WebGPU (#2032)
* Fix stable diffusion on WebGPU

* Remove hack, numpy cast only on webgpu

* No-copy numpy cast
2023-10-10 06:40:51 -07:00
George Hotz
adab724caa schedule2, keep the tests working with small changes (#1932)
* lazy cleanups

* ast functions take in LazyOps

* op instead of self.op

* _base for mops

* fix contiguous

* start schedule

* test_schedule

* fix openpilot

* more tests

* bugfix and test skip

* work

* make sure things get freed

* fix zerosized tensors

* fix failing test

* fix ceil and friends

* fix openpilot

* disable training

* disable test collectives
2023-09-28 09:14:43 -07:00
Dat D. Nguyen
ae9529e678 chore: remove redundant noise in stable diffusion example (#1910) 2023-09-24 21:33:45 +08:00
segf00lt
9e8c1dbf34 patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite, use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
George Hotz
722823dee1 stable diffusion: force fp16 free 2023-09-06 15:11:05 -07:00
Francis Lam
0379b64ac4 add seed option to stable_diffusion (#1784)
useful for testing correctness of model runs
2023-09-05 19:45:15 -07:00
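The seed option above amounts to seeding the RNG before the first latent is drawn, so identical runs produce identical images. A minimal sketch, assuming tinygrad's `Tensor.manual_seed`; everything beyond the `--seed` flag named in the commit is illustrative.

```python
# Minimal reproducibility sketch: fix the RNG before creating the starting latent.
import argparse
from tinygrad import Tensor

if __name__ == "__main__":
  parser = argparse.ArgumentParser()
  parser.add_argument("--seed", type=int, default=None, help="fix the RNG for reproducible output")
  args = parser.parse_args()
  if args.seed is not None: Tensor.manual_seed(args.seed)
  latent = Tensor.randn(1, 4, 64, 64)   # same --seed => same starting latent
```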
Karan Handa
a8aa13dc91 [ready] Replacing os with pathlib (#1708)
* replace os.path with pathlib

* safe convert dirnames to pathlib

* replace all os.path.join

* fix cuda error

* change main chunk

* Reviewer fixes

* fix vgg

* Fixed everything

* Final fixes

* ensure consistency

* Change all parent.parent... to parents
2023-08-30 10:41:08 -07:00
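The kind of mechanical conversion the commit above applies, sketched on a hypothetical cache path (names are illustrative):

```python
# Before: os.path style
import os
cache_dir = os.path.join(os.path.expanduser("~"), ".cache", "downloads")
fn = os.path.join(cache_dir, "model.safetensors")
os.makedirs(cache_dir, exist_ok=True)

# After: pathlib style
from pathlib import Path
cache_dir = Path.home() / ".cache" / "downloads"
fn = cache_dir / "model.safetensors"
cache_dir.mkdir(parents=True, exist_ok=True)
```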
Umut Zengin
1682e9a38a Fix: Stable Diffusion index (#1713) 2023-08-30 00:21:10 -04:00
George Hotz
aa7c98722b sd timing (#1706) 2023-08-28 20:22:57 -07:00