Commit Graph

1410 Commits

George Hotz
cfd13c083b refactor GenericShape for a big line reduction 2023-02-08 18:01:08 -06:00
George Hotz
c656513591 GPURunner class will replace CL cache eventually 2023-02-08 17:31:36 -06:00
George Hotz
a5a55ac19e GlobalCounters cache + assign in optim 2023-02-08 17:10:55 -06:00
George Hotz
d9555bc478 that turned out to be dumb 2023-02-08 16:52:29 -06:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
8c8a5a77dd refactor llvm into runtime and ops 2023-02-08 16:28:32 -06:00
George Hotz
45ce4de6f3 improve typing 2023-02-08 12:48:21 -06:00
George Hotz
2e1bdc889a write out all the functions, no auto binding (#543)
* write out all the functions, no auto binding

* cleanups, more types

* Slice is for internal calls only

* improve typing

* ugh, put slice back
2023-02-08 12:41:39 -06:00
George Hotz
d854337f0d nn/optim.py compiles now 2023-02-08 11:25:18 -06:00
George Hotz
1029deccb1 refactor ops_cpu and ops_torch to not share code 2023-02-08 11:11:42 -06:00
George Hotz
ee18420c13 dyn add of math ops 2023-02-08 10:04:30 -06:00
George Hotz
2844482a60 Mypy fun (#541)
* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup
2023-02-08 09:56:51 -06:00
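The "no or operator on dict in python 3.8" bullet above refers to the dict union operator from PEP 584, which only exists on Python 3.9+. A minimal sketch of the compatible spelling (the variable names here are illustrative, not from the repo):

```python
# The dict `|` union operator (PEP 584) was added in Python 3.9,
# so code that must still run on 3.8 uses unpacking instead.
defaults = {"device": "CPU", "debug": 0}
overrides = {"debug": 2}

# Python 3.9+ only:
# merged = defaults | overrides

# Works on Python 3.8 and later:
merged = {**defaults, **overrides}
print(merged)  # {'device': 'CPU', 'debug': 2}
```

Later keys win in both spellings, so the two forms are interchangeable here.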
George Hotz
996e0a10b7 update cpu and torch to hold buffers (#542)
* update cpu and torch to hold buffers

* save lines, and probably faster
2023-02-08 09:40:45 -06:00
Mitchell Goff
ae4f0aeb5f NumPy-like semantics for Tensor.__getitem__ (#506)
* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis
2023-02-08 08:59:46 -06:00
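The semantics this PR targets are the standard NumPy ones: negative indices count from the end, `None`/`np.newaxis` inserts a size-1 axis, and out-of-bounds slices are clamped like plain Python sequences. A short sketch of that reference behavior in NumPy itself (assuming NumPy is installed; this is the behavior being mimicked, not tinygrad code):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# negative indices count from the end
assert a[-1, -1] == 5

# None (np.newaxis) inserts a new axis of size 1
assert a[None].shape == (1, 2, 3)
assert a[:, np.newaxis, :].shape == (2, 1, 3)

# out-of-bounds slices are silently clamped, like Python lists
assert a[0:100].shape == (2, 3)
```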
George Hotz
0ac3286af0 factor out Device 2023-02-07 16:08:20 -06:00
George Hotz
2aeebd70a6 mypy will compile the shapetracker, no speed up 2023-02-07 15:43:44 -06:00
George Hotz
185d2e3678 fix map_buffer and add some __slots__ 2023-02-07 15:32:48 -06:00
George Hotz
aebe75d9a2 remove val expansion (#539)
* remove val expansion

* types for all shapetracker functions:

* more typing

* add all the parens to the test

* more types

* fix tests

* very minor speedup
2023-02-07 15:14:05 -06:00
George Hotz
001cc96e25 Lazy refactor (#538)
* refactor lazy to return ASTs

* a lil cleaner

* oops, compare ids

* gate on GRAPH

* cleanups

* less calls to log_op

* simpler

* realize_buffers -> map_buffers

* even simpler

* think in asts

* a lil cleaner

* NOOP means contiguous
2023-02-07 11:53:21 -06:00
George Hotz
02d8cb0959 lazy cleanup 2023-02-07 07:39:53 -06:00
George Hotz
d93563f39f fix KOPT 2023-02-07 06:56:33 -06:00
Jared Z
7604b17fbf TestZeroViewShapeTracker fix test (#481)
* TestZeroViewST test

* updated to align with st naming conventions in file

* Update test_shapetracker.py
2023-02-07 06:17:55 -06:00
George Hotz
c073271f20 more symbolic correctness 2023-02-07 00:03:14 -06:00
George Hotz
e961fd3a04 more symbolic test, ModNode is wrong 2023-02-06 23:43:21 -06:00
George Hotz
8cfeb118d6 symbolic new test 2023-02-06 23:27:26 -06:00
George Hotz
7c5a5ecdac even simpler symbolic 2023-02-06 22:47:00 -06:00
George Hotz
8b05de1841 symbolic cleanups 2023-02-06 22:12:11 -06:00
George Hotz
2a924e2b77 fix sz.sh for llvm 2023-02-06 15:36:05 -06:00
James Roberts
0d405fd5bc Parallelize CI tests (#535) 2023-02-06 15:27:44 -06:00
Andrey
4977d6f225 using tuples in isinstance (#534) 2023-02-06 14:40:26 -06:00
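The change in #534 relies on `isinstance` natively accepting a tuple of types, which collapses chained checks into one call. A minimal illustration:

```python
x = 3.0

# before: two calls or'd together
assert isinstance(x, int) or isinstance(x, float)

# after: one call with a tuple of types — True if x matches any of them
assert isinstance(x, (int, float))
```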
timmermansjoy
d56c57b112 adding more robust install method (#532) 2023-02-06 13:12:05 -06:00
George Hotz
fd3807c479 delete cherry and old cuda accel, promote llvm 2023-02-06 10:02:41 -06:00
George Hotz
90529d3750 tests are 20% faster (#529)
* pytorch CPU

* no cache, it's slower

* pytorch cpu for real

* remove double onnx
2023-02-06 09:56:14 -06:00
George Hotz
039de1b332 oops, pytest is for testing 2023-02-06 09:30:12 -06:00
George Hotz
6eb0e6a650 shuffle deps: always tqdm, make linting category 2023-02-06 09:27:01 -06:00
George Hotz
1d80639646 make linter test install testing deps 2023-02-06 09:21:48 -06:00
George Hotz
60bb64811c merge mypy into linters, no useless package update 2023-02-06 09:14:00 -06:00
George Hotz
c3d81bba2a test_train: Adam -> SGD 2023-02-06 08:55:41 -06:00
George Hotz
36c26a57b1 make slow LLVM opt optional 2023-02-05 20:24:12 -06:00
George Hotz
f7291f6ca3 fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl

* fix optimizer

* KernelCache

* oops, broke batchnorm

* hack to fix it

* fix llvm, less hacky gpu

* disable the cache

* cache just breaks things
2023-02-05 10:46:17 -08:00
Martin Loretz
97f0a82be7 Cache pip packages in github actions (#522)
* Cache pip dependencies in github actions

* Add setup.py as cache-dependency-path

* Test caching

* Test caching

* Upgrade setup python action

* Test caching

* Remove setup.py from cache-dependency-path

* Don't remove cache-dependency-path

* Don't cache linter package's

* Test caching

* Test caching

* Test caching

* Upgrade actions/checkout to v3
2023-02-03 20:04:20 -08:00
Martin Loretz
4ad67b4bbc Refactor triton buffer to use CLBuffer of cuda runtime (#524)
* Refactor triton buffer to use CLBuffer of runtime

* Fix opencl GT0
2023-02-03 20:02:41 -08:00
Jacky Lee
ad4f6aa2cf Add test for quick_gelu (#526)
* Add test for quick_gelu

* Bump PyTorch version for approximate
2023-02-03 20:01:39 -08:00
James Roberts
db0a9b0a2d Refactor CL.time_sum into GlobalCounters (#519) 2023-02-01 20:13:56 -08:00
Martin Loretz
45e847d284 Update triton to work in master (#517)
* Update triton to work in master

* Move mem_estimate out of runner
2023-02-01 12:58:14 -08:00
George Hotz
5e37f084db stable diffusion: clean up constant folding 2023-02-01 12:53:16 -08:00
George Hotz
175c38d1b3 triton: it already was GT0 2023-02-01 12:00:33 -08:00
Jacky Lee
486f023e81 Rename Normalize and move to nn (#513)
* Rename Normalize and move to nn

* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
George Hotz
cd97b036cc A Triton backend for tinygrad (#470)
* triton can add

* print stuff from triton

* write out file

* ops triton working

* reduce ops

* sort of works

* Triton bugfixes & implementation of remaining ops (#490)

* padding

* support pow, max, relu, gt0

* allocate return buffer

* Fix reduce

* Add tests for power op

* Fix triton illegal memory accesses and memory leak (#512)

* Fix mypy issue

* Add triton to setup.py

* Replace torch with pycuda

* Use one cuda stream for data transfer and kernels

* Remove triton submodule

* Fix memory leak by using weakrefs for caching

* Fix memory access by adding valid as mask for load

* Fix invalid kernel launches by flattening the grid (#515)

---------

Co-authored-by: Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>
2023-02-01 11:53:57 -08:00
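The "fix memory leak by using weakrefs for caching" bullet above describes a standard pattern: keeping a cache behind weak references so cached entries do not pin their values (and any GPU buffers they hold) in memory. A sketch with `weakref.WeakValueDictionary`; the `CompiledKernel` name is hypothetical, not the PR's actual class:

```python
import gc
import weakref

class CompiledKernel:
    """Stand-in for a cached compiled kernel (hypothetical name)."""
    def __init__(self, src):
        self.src = src

# Entries vanish once the value has no strong references left,
# so the cache itself never keeps kernels alive.
cache = weakref.WeakValueDictionary()

k = CompiledKernel("__kernel void add(...) {}")
cache["add"] = k
assert "add" in cache

del k         # drop the last strong reference
gc.collect()  # make collection deterministic across implementations
assert "add" not in cache
```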
George Hotz
4e24002bbe no generic exceptions 2023-02-01 11:14:37 -08:00