Commit Graph

91 Commits

George Hotz
a280cfe169 move dtypes to dtype.py (#2964)
* move dtypes to dtype.py

* fix urllib
2024-01-01 14:58:48 -08:00
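After this refactor the dtype definitions live in their own module; a minimal usage sketch (import path taken from the commit title, so treat it as indicative of that version):

    from tinygrad import Tensor
    from tinygrad.dtype import dtypes  # new home after the move out of helpers

    t = Tensor([1, 2, 3], dtype=dtypes.float32)  # pick a dtype explicitly
    print(t.dtype)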
George Hotz
c81ce9643d move globalcounters to ops (#2960)
* move globalcounters to ops

* missed a few

* sick of that failing
2024-01-01 14:21:02 -08:00
chenyu
7dc3352877 increase stable diffusion validation threshold 1e-4 -> 3e-4 (#2897)
saw a flaky CI failure at 1.1e-4, and 3e-4 is a good number
2023-12-21 11:45:25 -05:00
chenyu
a044125c39 validate stable diffusion for seed 0 (#2773)
* validate stable diffusion for seed 0

the closest false positive I can get is with the same setup and one less step: dist = 0.0036.
the same setup with fp16 has dist = 5e-6,
so setting the validation threshold to 1e-4 should be good

* run with --seed 0
2023-12-15 00:07:09 -05:00
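The validation described here compares a generated image against a known-good reference and fails the run when the distance exceeds a tolerance. A hedged sketch of that check (the helper and the max-abs metric are illustrative, not the exact script):

    import numpy as np

    def validate(out: np.ndarray, ref: np.ndarray, threshold: float = 1e-4) -> None:
      dist = np.abs(out - ref).max()  # one plausible distance metric
      assert dist < threshold, f"validation failed: dist={dist:.1e} >= {threshold:.1e}"

With the numbers above, 1e-4 sits well below the closest false positive (3.6e-3) while clearing fp16 noise (5e-6); the later commit widens it to 3e-4 after the flaky failure at 1.1e-4.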
chenyu
9afa8009c1 hot fix: explicitly set arange dtype to float (#2772) 2023-12-14 23:14:38 -05:00
George Hotz
9e07824542 move device to device.py (#2466)
* move device to device.py

* pylint test --disable R,C,W,E --enable E0611

* fix tests
2023-11-27 11:34:37 -08:00
George Hotz
095e2ced61 add name support to fetch (#2407)
* add name support

* use fetch in gpt2

* remove requests from main lib, networkx also optional

* umm, keep that assert

* updates to fetch

* I love the walrus so much

* stop bundling mnist with tinygrad

* err, https

* download cache names

* add DOWNLOAD_CACHE_VERSION

* need env.

* ugh, wrong path

* replace get_child
2023-11-23 14:16:17 -08:00
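The bullets sketch the shape of the new helper: urllib instead of requests, an optional name that overrides the URL-derived filename, and a DOWNLOAD_CACHE_VERSION cache directory. A hedged reconstruction in that spirit (paths and details are illustrative, not the exact tinygrad code):

    import hashlib, pathlib, tempfile, urllib.request
    from typing import Optional

    DOWNLOAD_CACHE_VERSION = "1"

    def fetch(url: str, name: Optional[str] = None) -> pathlib.Path:
      fn = name if name is not None else hashlib.md5(url.encode()).hexdigest()
      dest = pathlib.Path(tempfile.gettempdir()) / f"downloads_v{DOWNLOAD_CACHE_VERSION}" / fn
      if not dest.is_file():  # cache miss: download once, reuse afterwards
        dest.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(url) as r:
          dest.write_bytes(r.read())
      return dest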
chenyu
5ef8d682e3 clean up attentions in stable diffusion (#2275) 2023-11-11 14:25:36 -05:00
Ahmed Harmouche
265304e7fd Stable diffusion WebGPU port (#1370)
* WIP: Stable diffusion WebGPU port

* Load whole model: split safetensor to avoid Chrome allocation limit

* Gitignore .DS_Store, remove debug print

* Clip tokenizer in JS

* WIP: Compile model in parts (text model, diffusor, get_x_prev_and_pred_x0, decoder), and recreate forward logic in JS

* e2e stable diffusion flow

* Create initial random latent tensor in JS

* SD working e2e

* Log if some weights were not loaded properly

* Remove latent_tensor.npy used for debugging

* Cleanup, remove useless logs

* Improve UI

* Add progress bar

* Remove .npy files used for debugging

* Add clip tokenizer as external dependency

* Remove alphas_cumprod.js and load it from safetensors

* Refactor

* Simplify a lot

* Dedup base when limiting elementwise merge (webgpu)

* Add return type to safe_load_metadata

* Do not allow run when webgpu is not supported

* Add progress bar, refactor, fix special names

* Add option to choose from local vs huggingface weights

* lowercase tinygrad :)

* fp16 model dl, decompression client side

* Cache f16 model in browser, better progress

* Cache miss recovery

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-03 18:29:16 -07:00
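Of the steps above, the safetensors split is the one with a hard constraint: Chrome refuses single allocations past a limit, so the model file has to be served in parts. A hedged sketch of the chunking step (chunk size and naming are illustrative):

    import pathlib

    def split_for_browser(path: pathlib.Path, chunk: int = 1 << 28) -> list[pathlib.Path]:
      data, parts = path.read_bytes(), []
      for i in range(0, len(data), chunk):  # 256 MiB pieces stay under the limit
        part = path.with_name(f"{path.name}.part{i // chunk}")
        part.write_bytes(data[i:i + chunk])
        parts.append(part)
      return parts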
George Hotz
6dc8eb5bfd universal disk cache (#2130)
* caching infra for tinygrad

* non-str key

* fix linter

* no shelve in beam search

* beam search caching

* check tensor cores with beam too

* pretty print

* LATEBEAM in stable diffusion
2023-10-22 10:56:57 -07:00
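The cache persists compiled-kernel and beam-search results across runs (replacing shelve, per the bullets). A hedged sketch of a sqlite-backed decorator in that spirit; tinygrad's real diskcache differs in schema and location, and pickling the arguments is what makes non-str keys work:

    import pathlib, pickle, sqlite3, tempfile

    _db = sqlite3.connect(pathlib.Path(tempfile.gettempdir()) / "tinygrad_cache.db")
    _db.execute("CREATE TABLE IF NOT EXISTS cache (k BLOB PRIMARY KEY, v BLOB)")

    def diskcache(table: str):
      def decorator(fn):
        def wrapper(*args):
          key = pickle.dumps((table, args))  # pickling means non-str keys just work
          row = _db.execute("SELECT v FROM cache WHERE k = ?", (key,)).fetchone()
          if row is not None:
            return pickle.loads(row[0])
          out = fn(*args)
          _db.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, pickle.dumps(out)))
          _db.commit()
          return out
        return wrapper
      return decorator

Decorating an expensive search with @diskcache("beam_search") then memoizes its results across processes, which is what lets BEAM results survive restarts.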
Ahmed Harmouche
0d3410d93f Stable diffusion: Make guidance modifiable (#2077) 2023-10-15 14:36:43 -07:00
Ahmed Harmouche
e27fedfc7b Fix stable diffusion output error on WebGPU (#2032)
* Fix stable diffusion on WebGPU

* Remove hack, numpy cast only on webgpu

* No-copy numpy cast
2023-10-10 06:40:51 -07:00
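A note on the last bullet: numpy's astype only copies when it must, so requesting the dtype the array already has is free. A minimal sketch of the no-copy cast idea (illustrative, not the exact fix):

    import numpy as np

    x = np.zeros((512, 512, 3), dtype=np.float32)
    y = x.astype(np.float32, copy=False)  # dtype already matches: no new buffer
    assert y is x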
George Hotz
adab724caa schedule2, keep the tests working with small changes (#1932)
* lazy cleanups

* ast functions take in LazyOps

* op instead of self.op

* _base for mops

* fix contiguous

* start schedule

* test_schedule

* fix openpilot

* more tests

* bugfix and test skip

* work

* make sure things get freed

* fix zerosized tensors

* fix failing test

* fix ceil and friends

* fix openpilot

* disable training

* disable test collectives
2023-09-28 09:14:43 -07:00
Dat D. Nguyen
ae9529e678 chore: remove redundant noise in stable diffusion example (#1910) 2023-09-24 21:33:45 +08:00
segf00lt
9e8c1dbf34 patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite, use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
George Hotz
722823dee1 stable diffusion: force fp16 free 2023-09-06 15:11:05 -07:00
Francis Lam
0379b64ac4 add seed option to stable_diffusion (#1784)
useful for testing correctness of model runs
2023-09-05 19:45:15 -07:00
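A fixed seed makes the initial latent noise deterministic, so two runs can be diffed exactly. A minimal sketch of the wiring (Tensor.manual_seed is tinygrad API; the argparse flag is illustrative):

    import argparse
    from tinygrad import Tensor

    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=None, help="fix RNG for reproducible output")
    args = parser.parse_args()
    if args.seed is not None:
      Tensor.manual_seed(args.seed)  # latent noise is now deterministic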
Karan Handa
a8aa13dc91 [ready] Replacing os with pathlib (#1708)
* replace os.path with pathlib

* safe convert dirnames to pathlib

* replace all os.path.join

* fix cuda error

* change main chunk

* Reviewer fixes

* fix vgg

* Fixed everything

* Final fixes

* ensure consistency

* Change all parent.parent... to parents
2023-08-30 10:41:08 -07:00
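The flavor of the conversion, as a before/after sketch (paths illustrative):

    import os.path
    import pathlib

    # before: string surgery with os.path
    weights = os.path.join(os.path.dirname(__file__), "weights", "model.ckpt")

    # after: / joins, and parents[n] instead of chained .parent.parent
    weights = pathlib.Path(__file__).parent / "weights" / "model.ckpt"
    repo_root = pathlib.Path(__file__).parents[2]  # three levels up in one step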
Umut Zengin
1682e9a38a Fix: Stable Diffusion index (#1713) 2023-08-30 00:21:10 -04:00
George Hotz
aa7c98722b sd timing (#1706) 2023-08-28 20:22:57 -07:00
nimlgen
1c0449e190 add cache collector (#1595)
* init cache collector

* add test_cache_collector.py

* switch GlobalCounters.cache to CacheCollector

* init jit models test

* jitted SD

* add debug msg to print loaded bufs count

* moved cache collector to jit

* clearer SD

* no double device import
2023-08-28 19:59:55 -07:00
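The collector records the GPU kernels a model run enqueues so the JIT can replay them without re-tracing Python, which is what "jitted SD" buys. A usage sketch with tinygrad's TinyJit (the import path has moved between versions, so treat it as indicative):

    from tinygrad import Tensor, TinyJit

    @TinyJit
    def step(x: Tensor) -> Tensor:
      return (x @ x).relu().realize()  # early calls record kernels, later calls replay

    for _ in range(4):
      step(Tensor.randn(64, 64))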
George Hotz
718ced296c move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
George Hotz
b9feb1b743 fp16 support in stable diffusion 2023-08-20 05:37:21 +00:00
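fp16 halves the weight memory, which is what lets the model fit on smaller GPUs. A hedged sketch of the cast-on-load idea (safe_load and load_state_dict are tinygrad state helpers; casting straight off the disk device may need an extra move in the real loader):

    from tinygrad.nn.state import safe_load, load_state_dict

    def load_fp16(model, path: str) -> None:
      # cast every weight to float16 before binding it to the model
      state = {k: v.half() for k, v in safe_load(path).items()}
      load_state_dict(model, state)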
George Hotz
47f18f4d60 [New] SD: Refactor AttnBlock, CrossAttention, CLIPAttention to share code (#1516) (#1518)
* Refactor AttnBlock, CrossAttention, CLIPAttention to share code

* Reshape and transpose in loop

* Bugfix on attention mask

Co-authored-by: Jacky Lee <39754370+jla524@users.noreply.github.com>
2023-08-10 15:04:18 -07:00
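The core that AttnBlock, CrossAttention, and CLIPAttention share is scaled dot-product attention; a minimal sketch of the factored-out piece (shapes assumed (batch, heads, seq, head_dim); the real refactor also handles the attention mask and the reshape/transpose loop):

    from tinygrad import Tensor

    def attention(q: Tensor, k: Tensor, v: Tensor) -> Tensor:
      scores = (q @ k.transpose(-2, -1)) * (q.shape[-1] ** -0.5)  # scaled similarity
      return scores.softmax(-1) @ v                               # weighted sum of values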
George Hotz
c82bd59b85 Revert "SD: Refactor AttnBlock, CrossAttention, CLIPAttention to share code (#1513)" (#1515)
This reverts commit 85e02311a2.
2023-08-10 09:08:51 -07:00
Jacky Lee
85e02311a2 SD: Refactor AttnBlock, CrossAttention, CLIPAttention to share code (#1513)
* Refactor AttnBlock, CrossAttention, CLIPAttention to share code

* Reshape and transpose in loop
2023-08-10 08:52:33 -07:00
George Hotz
d78fb8f4ed add stable diffusion and llama (#1471)
* add stable diffusion and llama

* pretty in CI

* was CI not true

* that

* CI=true, wtf

* pythonpath

* debug=1

* oops, wrong place

* uops test broken for wgpu

* wgpu tests flaky
2023-08-06 21:31:51 -07:00
Felix
97a6029cf7 Corrected a few misspelled words (#1435) 2023-08-04 16:51:08 -07:00
George Hotz
f27df835a6 delete dead stuff (#1382)
* delete bpe from repo

* remove yolo examples

* Revert "remove yolo examples"

This reverts commit cd1f49d466.

* no windows
2023-07-31 11:17:49 -07:00
George Hotz
bfbb8d3d0f fix ones, BS=2 stable diffusion, caching optimizer (#1312)
* fix ones, BS=2 stable diffusion

* caching optimizer

* print search time

* minor bug fix
2023-07-21 09:55:49 -07:00
George Hotz
f45013f0a3 stable diffusion: remove realizes we don't need 2023-07-20 19:53:07 -07:00
George Hotz
b58dd015e3 stable diffusion: remove import numpy as np 2023-07-20 19:35:44 -07:00
George Hotz
35bc46289c stable diffusion: use new tinygrad primitives 2023-07-20 19:25:49 -07:00
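The flavor of these two cleanups: random latents come from Tensor primitives now, so the example no longer needs numpy at all (illustrative):

    from tinygrad import Tensor

    # before: latent = Tensor(np.random.randn(1, 4, 64, 64).astype(np.float32))
    latent = Tensor.randn(1, 4, 64, 64)  # same shape, no numpy import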
AN Long
f75de602df fix typo in stable diffusion example (#1219) 2023-07-11 15:26:40 -07:00
Diogo
2d4370b487 Adds tril & triu support (#936)
* triu & tril support

* lint and kernel count error

* switched shape indices

* larger shape tests

* reverted numpy removal until #942 is resolved
2023-06-09 22:13:20 -07:00
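Usage sketch of the new ops (they mirror the NumPy/PyTorch names: lower and upper triangle of a matrix, with a diagonal offset):

    from tinygrad import Tensor

    x = Tensor.ones(3, 3)
    print(x.tril().numpy())   # zeros everything above the main diagonal
    print(x.triu(1).numpy())  # keeps only entries strictly above the main diagonal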
Diogo
3bb38c3518 limit split to 1 due to windows path containing : (#944) 2023-06-06 10:27:54 -07:00
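The bug in one line: device strings embed a path after the first colon, and Windows paths contain a colon of their own, so an unlimited split mangles them. Limiting maxsplit keeps the path intact (a sketch; the "disk:" format follows tinygrad's device-string convention):

    device, path = "disk:C:\\weights\\model.ckpt".split(":", 1)
    assert device == "disk" and path == "C:\\weights\\model.ckpt"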
George Hotz
ed1963b899 Fast DiskTensor to other Tensor (#916)
* make disktensors fast

* loading

* loader for sd and llama
2023-06-03 12:25:41 -07:00
George Hotz
46d419060b start on mlperf models 2023-05-10 16:30:49 -07:00
Kirill
0fe5014b1f Use pathlib (#711)
* Use pathlib in llama

* Use pathlib in stablediffusion
2023-03-18 13:49:21 -07:00
Kirill
af7745073f Add comments to SD (#686)
* Add explanation for empty lambdas

* Fix my_unpickle if pytorch_lightning is installed

* oops
2023-03-12 10:56:49 -07:00
George Hotz
b1206bcb18 third try at torch loading (#677)
* third try at torch loading

* numpy fixed

* fix enet compile

* load_single_weight supports empty weights

* oops, CPU wasn't the default

* so many bugs
2023-03-10 19:11:29 -08:00
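The point of the loader is reading torch checkpoints without importing torch. A hedged usage sketch (torch_load is the helper's eventual name in tinygrad's state module; the path is illustrative):

    from tinygrad.nn.state import torch_load

    state_dict = torch_load("sd-v1-4.ckpt")  # pickle+zip parsing, no torch import
    print(sorted(state_dict)[:3])            # peek at a few weight names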
George Hotz
8bf75a7fdd fix stable diffusion and CI 2023-03-10 17:48:12 -08:00
Pankaj Doharey
9d97d97b26 Opens image in default viewer after saving. (#612) 2023-03-03 17:28:49 -08:00
Jacky Lee
c35fcc6964 Replace phrase for prompt (#555) 2023-02-12 09:04:44 -08:00
Kirill
27154db99a Downloads weights in examples/stable_diffusion.py (#537)
* Downloads weights in examples/stable_diffusion.py

* use download_file_if_not_exists in fetch

* make consistent with previous NOCACHE behavior
2023-02-10 14:37:04 -06:00
Jacky Lee
f08187526f Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
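The __getitem__ rewrite folded into this commit gives Tensors NumPy-like indexing; a short sketch of what it enables:

    from tinygrad import Tensor

    t = Tensor.arange(12).reshape(3, 4)
    print(t[-1].numpy())     # negative index: last row
    print(t[:, None].shape)  # None (np.newaxis) adds an axis -> (3, 1, 4)
    print(t[1:10].shape)     # out-of-bounds slices clamp like Python lists -> (2, 4)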
George Hotz
5e37f084db stable diffusion: clean up constant folding 2023-02-01 12:53:16 -08:00
Jacky Lee
486f023e81 Rename Normalize and move to nn (#513)
* Rename Normalize and move to nn

* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
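After this change the module matches PyTorch's GroupNorm, including the dim>1 behavior noted above; usage sketch (tinygrad's nn.GroupNorm, values illustrative):

    from tinygrad import Tensor
    from tinygrad.nn import GroupNorm

    gn = GroupNorm(num_groups=8, num_channels=32)  # affine weight/bias by default
    y = gn(Tensor.randn(2, 32, 16, 16))            # normalizes within each channel group
    print(y.shape)  # (2, 32, 16, 16)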
George Hotz
487685919b Revert "Rename Normalize and move to nn (#415)" (#474)
This reverts commit d768acb6a9.
2023-01-25 07:50:04 -08:00
Jacky Lee
d768acb6a9 Rename Normalize and move to nn (#415)
* Rename Normalize and move to nn

* Fix comparison to None error

* Add test for GroupNorm

* Rename test case

* Flip parameters to match PyTorch

* Increase error tolerance

* Fix elementwise_affine on channels

* Match arguments with PyTorch

* Initialize weight and bias only when affine is true

* Is this it?

* A bit cleaner

* Handle case where weight or bias is None
2023-01-25 07:47:59 -08:00