tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-14 01:18:26 -05:00

Author	SHA1	Message	Date
chenyu	50927defad	s/lazydata.realized/lazydata.base.realized/g (#2914 ) * s/lazydata.realized/lazydata.base.realized/g * not that	2023-12-22 14:45:13 -05:00
chenyu	7dc3352877	increase stable diffusion validation threshold 1e-4 -> 3e-4 (#2897 ) saw a flaky CI failure with 1.1e-4, and 3e-4 is a good number	2023-12-21 11:45:25 -05:00
George Hotz	64dded27f0	pad ops broke coder (#2881 ) * pad ops broke coder * that contiguous fixes it * Update lazy.py	2023-12-20 17:03:41 -08:00
George Hotz	1765849937	new lazy, benchmark (#2878 ) * lazy rewrite, try 2 * min fix tests * pass contig test * put broken pads back * move that to realize * no contig child fixes array packing * so wrong * now that's correct * base children * fix bind issues * disable to_image_idx * fix tests * that failure shouldn't break other tests * more fixes * fix torch * skip failing tests in CI * 1e-7 * half is broken * 1e-6 margin of error	2023-12-20 14:33:21 -08:00
chenyu	857c35d256	make gpt2 decode output just once at the end (#2869 ) also updated function name from greedy_until to generate, as it's not greedy nor until	2023-12-20 12:14:55 -05:00
chenyu	6d7e9e0a56	hotfix convert Y_train to int before passing into index (#2850 )	2023-12-19 11:40:56 -05:00
chenyu	0723f26c80	dtypes.default_float and dtypes.default_int (#2824 )	2023-12-18 12:21:44 -05:00
George Hotz	c6eb618013	tests from new lazy branch (#2774 ) * tests from new lazy branch * fix lin 11 * that was needed * doesn't fail * mark * meant that * llvm passes	2023-12-14 23:06:39 -08:00
chenyu	a044125c39	validate stable diffusion for seed 0 (#2773 ) * validate stable diffusion for seed 0 the closest false positive i can get is with the setup and one less step. dist = 0.0036 same setup with fp16 has dist=5e-6. so setting validation threshold to 1e-4 should be good * run with --seed 0	2023-12-15 00:07:09 -05:00
chenyu	9afa8009c1	hot fix explicitly set arange dtype to float (#2772 )	2023-12-14 23:14:38 -05:00
chenyu	c0f76ed4ea	transformer kvcache and mask have same dtype as input (#2771 ) * transformer kvcache and mask have same dtype as input * don't use `=0` in cstyle ternary where * (bool) * where float16 test	2023-12-14 22:41:51 -05:00
jaredeh	d8952fc575	updating to work with new internal apis (#2755 )	2023-12-13 21:54:47 -08:00
Ivan Vnučec	8d206f6bfd	fix help message (#2705 ) llama -> mixtral	2023-12-10 22:04:35 -08:00
George Hotz	59ab3675a3	faster mixtral + green for new kernels (#2701 ) * green for new kernels * track ram	2023-12-10 19:04:58 -08:00
George Hotz	b01e3907a1	mixtral touch up: two lines	2023-12-10 17:21:49 -08:00
George Hotz	b3982187d1	Mixtral Example (#2691 ) * mixtral * simpler * global counters * simpler * weights arg	2023-12-10 17:18:31 -08:00
George Hotz	0fd44259cd	bf16 fix + cleanups from mixtral (#2698 ) * bf16 fix + cleanups from mixtral * generic bf16 cast	2023-12-10 16:31:52 -08:00
chenyu	fae5394845	validate llama output (#2681 ) * validate llama output * does not work with quantize	2023-12-08 16:42:01 -05:00
nickovaras	182d067407	Update yolov3.py (#2680 ) The current yolov3 example is broken with the current implementation of fetch in the helpers. I was tempted to fix the helpers instead but that could have just as well broken other examples.	2023-12-08 12:59:38 -08:00
George Hotz	00d9eda961	FROM -> COPY, move vars_from_ast (#2675 )	2023-12-07 16:32:30 -08:00
chenyu	539b00a645	move llama getenv("JIT") from models to examples (#2671 ) Transformer class has a jit param so we should use that in the caller	2023-12-07 12:43:22 -05:00
chenyu	371005cb2d	use one kvcache tensor in gpt2 instead of two separate caches (#2662 ) * use one kvcache tensor in gpt2 * test case * is None * better test cases	2023-12-06 20:59:17 -05:00
chenyu	0978c24b8e	fast gpt2 embedding with variable bs=1 (#2596 )	2023-12-05 23:01:17 -05:00
chenyu	229ada5fe5	Gpt2 benchmark with HALF and BEAM (#2636 ) * benchmark gpt2 with half and beam * BEAM=4 * optional validation * green is good * we care	2023-12-05 22:15:16 -05:00
Oleg Rybalko	7c427d738c	don't apply padding on script call (#2585 ) * don't apply padding on script call * no need for new param because batch_size value can be utilized to check * fixed argument naming	2023-12-05 16:34:10 -08:00
George Hotz	9d7ead84e1	hotfix: no need for model cache in examples/coder.py	2023-12-05 16:27:36 -08:00
George Hotz	232ed2af3f	more test cleanups (#2631 ) * more test cleanups * move test example back	2023-12-05 16:17:57 -08:00
chenyu	a63f48d3db	gpt2 half for kvcache and output logits (#2630 ) * gpt2 more half * half is fine after softmax	2023-12-05 16:54:56 -05:00
George Hotz	8c67eb1c92	GPT bugfixes (#2624 ) * simple fixes * fix exp2 * fixed * parallel beam for CUDA * fix image dtypes	2023-12-05 11:42:28 -08:00
qazal	ab2d4d8d29	Fix cl import in the copy_speed test and cifar example (#2586 ) * fix CL import * update test to only run on GPU * update hlb_cifar too	2023-12-03 09:22:07 -08:00
Oleg Rybalko	5e87083783	Whisper + LLAMA + VITS (#2332 ) * feat: working voice 2 text using whisper * feat: added llama generation * feat: vits init * feat: more accurate voice conversion * feat: support for tts and working pipeline for the first pass * fix: linter checks * refactored vits initialization and inference, added mmts-tts support * fixed process sync and now we can have an infinite conversation * reuse output stream to remove overhead of creating a new one each time * added pre-prompt configuration with yaml files * adjusted code to merge PR which changed whisper * optimized whisper, now it's blazing fast and also reduced number of lines * added better debug printing * use jitted encode function for whisper, added timings and removed response delim to save speed on generating those tokens * fixed hf convert and now it's working with tinyllama * added tinyllama config * refactored code and made it work with all llama models * prettier order * prettier order * fixed suffix for tinyllama and refactored convert_from_hf * added missing parameters * fixed stream release and added missing params * jitted dp and encoder * jitted flow forward * removed re-init of espeak on each call to save up time * jitted generator forward for blazing fast tts * added contextmanager for displaying a chat log * removed whitespace for pylint * updated code to support latest fetch func * wait for llama eos token and pass params from cli to llama * listen for not fixed amount of time * refactored code a bit * removed thresholding and now the output streams directly to whisper * tokenize llama output for vits batch size to work and stream each sentence to a speaker * changed speaker * whisper is now printing on the same line * don't trigger llama on whisper output in parens * added tinyllama chat model * adjusted code to work with tinyllama chat model * removed unused cli arg * autofetch tokenizer and tinyllama model. add 3 chat tokens to the tokenizer * fixed issue with long sentences by chunking them * support for multiline llama output * prettified log output * adjusted sentence length * remove quote from response to avoid funny tts * fixed prompts * added missing parameter	2023-12-02 15:03:46 -08:00
chenyu	05a5357dd9	fix handcode_resnet50_opt.py (#2558 )	2023-12-01 20:51:21 -05:00
George Hotz	2c363b5f0b	new style device (#2530 ) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
Davi Silva	ddeec24fa8	Cleanup & fix llama.py (#2524 ) * docs, cleanup crap * comma AI * fix 70B * this is why lexical scope exists	2023-11-30 16:00:17 -05:00
George Hotz	d87a246439	move to new cached fetch (#2493 ) * move to new cached fetch * extra.utils is over * loads * bump download cache * bump timeout	2023-11-28 17:36:55 -08:00
chenyu	a739c6646e	fp16 in gpt2 attention (#2491 ) * fp16 in gpt2 attention * HALF	2023-11-28 19:27:03 -05:00
chenyu	7f9a4c1285	fp16 and noshow flags for gpt2 (#2470 )	2023-11-27 16:23:03 -05:00
George Hotz	9e07824542	move device to device.py (#2466 ) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
Akshay Kashyap	a031afb2f6	Update display_name in resnet50 example (#2454 )	2023-11-26 16:07:36 -08:00
George Hotz	7170a9a057	coder.py can write and run code (#2439 ) * wip mistral * coder * touchups * cleanups * mistral cleanups * clean up cache create * download the weights, fix tests * fix llama loading * global fixup * clean up all * move llama model * cleanups * Revert "cleanups" This reverts commit `a71c5d59eb`. * fine, leave it	2023-11-25 12:27:54 -08:00
Davi Silva	df41a57e09	Fix: missing n_kv_heads for smaller models from huggingface (#2438 ) * fix: missing n_kv_heads for smaller models from huggingface * a lil golfing	2023-11-25 10:29:04 -08:00
George Hotz	96c12fdeab	multibatch gpt2 (#2432 ) * support multibatch gpt-2 * multi output * no default JIT in CI	2023-11-24 18:10:10 -08:00
Francis Lata	7169de57e2	Update VITS to use fetch helper (#2422 ) * use fetch helper on vits * remove duplicate weight loading	2023-11-24 08:50:03 -08:00
George Hotz	8f89e21fca	torch and numpy don't share ops anymore (#2412 ) * torch and numpy don't share ops anymore * that should be filtered out elsewhere * still const * graph + enet example cleanup * hmm, we do still need it because of symbolic	2023-11-23 16:58:10 -08:00
George Hotz	5bb720a777	Cocoa is no longer used	2023-11-23 14:31:21 -08:00
George Hotz	095e2ced61	add name support to fetch (#2407 ) * add name support * use fetch in gpt2 * remove requests from main lib, networkx also optional * umm, keep that assert * updates to fetch * i love the walrus so much * stop bundling mnist with tinygrad * err, https * download cache names * add DOWNLOAD_CACHE_VERSION * need env. * ugh, wrong path * replace get_child	2023-11-23 14:16:17 -08:00
Francis Lata	6d672785db	Update Whisper to use fetch helper (#2401 ) * update whisper to use new fetch helper * simplify file opening * update name * update key name to "downloads-cache"	2023-11-23 12:59:59 -08:00
George Hotz	2dec86970a	hotfix: default remains gen 1 llama	2023-11-21 14:43:02 -08:00
mmmkkaaayy	7f0cc4a4e8	whisper: support audio >30s (#2378 ) * whisper: support audio >30s * make prompt indexing consistent with reference repo * fix online	2023-11-21 14:37:51 -08:00
Oleg Rybalko	7220f5c9fc	fixed hf convert and now it's working with tinyllama (#2374 ) * fixed hf convert and now it's working with tinyllama * added tinyllama config * refactored code and made it work with all llama models * prettier order * prettier order * fixed suffix for tinyllama and refactored convert_from_hf * dynamically update help if MODEL_PARAMS changes and default size is the 1st	2023-11-21 14:36:52 -08:00


1207 Commits