George Hotz
411392dfb7
move files into uop dir (#10399)
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
George Hotz
8a04a3a77a
rename LazyBuffer -> UOp [pr] (#8169)
* rename LazyBuffer -> UOp [pr]
* fix docs
2024-12-11 16:15:52 -08:00
George Hotz
3169cb386d
remove graph [pr] (#7085)
2024-10-16 11:40:07 +08:00
George Hotz
4df5c7a4ef
move lazy to engine [pr] (#6886)
* move lazy to engine [pr]
* engine.lazy
2024-10-04 23:19:26 +08:00
George Hotz
d438d5698d
bring buffer back to device (#4517)
2024-05-10 11:22:31 -07:00
chenyu
c71627fee6
move GlobalCounter to helpers (#4002)
break circular import between ops and buffer
2024-03-30 00:30:30 -04:00
George Hotz
9a6ac2a50a
create the buffer with the LazyBuffer (#3977)
* create the buffer with the LazyBuffer
* fixes
* hack underlying buffer when we change dtype
* we only care about allocated buffers
* asserts
2024-03-28 19:31:28 -07:00
George Hotz
c81ce9643d
move globalcounters to ops (#2960)
* move globalcounters to ops
* missed a few
* sick of that failing
2024-01-01 14:21:02 -08:00
George Hotz
1765849937
new lazy, benchmark (#2878)
* lazy rewrite, try 2
* min fix tests
* pass contig test
* put broken pads back
* move that to realize
* no contig child fixes array packing
* so wrong
* now that's correct
* base children
* fix bind issues
* disable to_image_idx
* fix tests
* that failure shouldn't break other tests
* more fixes
* fix torch
* skip failing tests in CI
* 1e-7
* half is broken
* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
chenyu
73cadfbb3c
Remove pytest markers (#2831)
* remove pytest marker
* fix some, skip some
* tweak
* fix
* skip slow
* skip more
2023-12-18 18:53:28 -05:00
George Hotz
2f7aab3d13
move optimize_local_size (#2221)
* move optimize_local_size
* interpret_ast
2023-11-05 21:00:52 -08:00
George Hotz
f5467cfedc
Devicebufferless (#708)
* runs one metal kernel
* conv2d works
* ops tests are passing
* const folding
* all ops work
* pre commit always passes
* torch works
* working still
* fix graph test
* tests passing
* image almost works
* image conv works
* most images
* fix custom
* fix assignment
* fix compile enet
* clean up comments
* fix realize return value
* include shapetracker in LB repr
* copy should make a copy
* reenable method cache
* fix lna
* dtypes in graph
* forward only for IMAGE=2
* simple realize
* getting close
* fixup new api, it's good except the kernel count
* back to 197 kernels
* tests should pass
* go to a real float
* no type_on_cpu
* fix the docs
* put shapetracker back in its proper place
2023-03-18 14:40:23 -07:00
George Hotz
bfcec234a2
Refactor ASTs (#622)
* ugh worst branch name
* compiler refactor continues
* scc -> cloc
* buf -> _buf
* finish _buf, and program -> runtime
* gpu is still working, clang isn't
* clang in new style
* ops_metal
* something broke it
* improve metal
* clean up tons of cl crap
* hack fix sync
* cleaner gpu
* gpu metal clang
* cleanups
* minor refactor
* GPUCodegen
* fix up LLVM
* blind CUDA refactor
* codegen / runtime
* keep ops naming
* linter passes
* woah, llvm was allocing 4x what it needed to
* bugfixes
* fix openpilot compiler
* fix compile_efficientnet
* method cache should fix tests
* deal with duped functions
2023-03-01 18:57:29 -08:00
George Hotz
643e8b0388
fix tests, test bn evaluate too
2023-02-27 10:39:47 -08:00
George Hotz
fed95119dc
CL.mem_used -> GlobalCounters.mem_used
2023-02-10 23:13:29 -06:00
George Hotz
3d63934995
refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime
* fix thneed, rename cl to _cl
* bugfix + _cuda
* fix tests
* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
fff1f046b0
Simple version of the new GPU backend (#458)
* newgpu
* more to delete
* hmm, tests pass with constant folding
* fix lint/type
* fix constant folding
* comment and rerun tests
* lazy touchups
* fix graph_batchnorm test
* smaller transformer to fix OOM
* Revert "smaller transformer to fix OOM"
This reverts commit a44ef8edc2.
* no func cache
* introspect
* touchups
* CLASTKernel
* ugh, it was lru_cache
* codegen
* spacing
* old gpu still in opencl
* typing fix
2023-01-10 19:16:02 -08:00
George Hotz
6a8fb53304
move ops.py into lazy.py (#402)
* move ops.py into lazy.py
* fix graph and linter
* ugh, didn't add
2022-10-25 13:58:03 -07:00
George Hotz
0516359af8
fix stupid OPENCL=1 OOM
2022-09-06 14:29:23 -07:00