Commit Graph

10417 Commits

Author SHA1 Message Date
Daniel Davis
4998bf49b3 Basic editorconfig support (#422)
Almost every IDE or text editor supports
[editorconfig](https://editorconfig.org/).
I've set it up to enforce only the 2-space Python indents for now.
2022-11-08 10:34:25 -08:00
marcojob
c3d9c9b24c Fix issue where batch_invstd not being set (#421)
batch_invstd can be falsely assumed to be set even though it is None,
since hasattr does not return False in this case.
In BatchNorm2D a reshape is then attempted, which raises an exception.
2022-11-08 09:24:53 -08:00
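The hasattr pitfall described in c3d9c9b24c can be sketched in a few lines. This is a minimal illustration, not the actual tinygrad code: the `Layer` class is a hypothetical stand-in for BatchNorm2D, and the `getattr`-based check is one possible way to guard against the problem, not necessarily the fix the commit applied.

```python
class Layer:
  # Hypothetical stand-in for a layer such as BatchNorm2D (an assumption).
  def __init__(self):
    # The attribute is declared up front but holds no value yet.
    self.batch_invstd = None

layer = Layer()

# hasattr only checks that the attribute exists, so it is True even for None.
exists = hasattr(layer, "batch_invstd")

# An explicit None check separates "declared" from "actually set".
is_set = getattr(layer, "batch_invstd", None) is not None

print(exists, is_set)
```

Code that branches on `exists` alone would go on to use the None value, for example in a reshape, whereas branching on `is_set` skips that path.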
Liam
8dc28dd733 Create python-publish.yml (#163) v0.4.0 2022-11-08 08:45:01 -08:00
George Hotz
92ed87b0a5 bump version to 0.4.0 2022-11-08 08:44:42 -08:00
George Hotz
9781b4c3af rename test functions to helper_ 2022-11-07 21:27:56 -08:00
George Hotz
9884be2ad5 ugh, that too 2022-11-07 21:21:35 -08:00
George Hotz
537a9eb414 fix termcolor import 2022-11-07 21:19:08 -08:00
George Hotz
2cc1d970c6 updates from the chonker branch 2022-11-07 21:12:08 -08:00
George Hotz
d878065ece Gemm (#416)
* gemm

* off by factor of 5

* 50 GFLOPS

* works

* 91 gflops

* working at 50G

* works

* iy

* 150 GFLOPS

* 150 GFLOPS

* N=2048 is still fast

* threading soon

* multithread

* pinning

* throttling is sad

* Align matrices to cacheline width (#361)

Co-authored-by: cloud <Cloud11665@gmail.com>
2022-11-06 10:07:28 -08:00
George Hotz
caea34c529 1s are always mergable 2022-11-03 10:50:48 -07:00
George Hotz
c48fc47d01 fix type error 2022-10-31 09:56:56 -07:00
George Hotz
9585b6c0cf comments and readability in lazy.py 2022-10-30 19:50:48 -07:00
George Hotz
db2da22a04 stop blowing up floats 2022-10-30 16:47:16 -07:00
George Hotz
8afc643bb1 fix bug in ops test, it was cheating somehow 2022-10-30 16:43:24 -07:00
George Hotz
b7a115e5e5 rewrite some strideds into reshapes 2022-10-30 16:31:27 -07:00
George Hotz
8c849e637c that was in there twice, DEBUG>=4 to see loop opt 2022-10-30 15:31:39 -07:00
George Hotz
cfdf803b52 fix llvm vectorization by adding analysis passes from the target machine 2022-10-30 15:28:36 -07:00
George Hotz
2f602a92ff separate STRIDED and EXPAND 2022-10-30 13:23:58 -07:00
George Hotz
544cb0a069 oops, remove while(1) 2022-10-29 14:05:13 -07:00
George Hotz
4b6097f81d more amx notes 2022-10-29 14:04:10 -07:00
George Hotz
fdb43fe553 gemm is 1.7 TFLOPS on a single M1 core 2022-10-29 13:42:33 -07:00
George Hotz
52bfbc31be vectorization 2022-10-29 12:47:52 -07:00
George Hotz
e473d35f90 llvm doesn't vectorize 2022-10-29 11:59:48 -07:00
George Hotz
86eb06eb76 accurate flop estimation 2022-10-28 19:13:20 -07:00
George Hotz
7909786dbf one more opt test 2022-10-28 18:37:53 -07:00
George Hotz
dd543fbc7a MovementOps is unused 2022-10-28 18:26:08 -07:00
George Hotz
71b336503f no RESHAPEs in the AST 2022-10-28 18:25:30 -07:00
George Hotz
294ab9e2f8 more test opt 2022-10-28 18:04:12 -07:00
George Hotz
f885ceb695 test speed w/o bias 2022-10-28 11:22:15 -07:00
George Hotz
3735e26492 very minor 2022-10-28 09:39:30 -07:00
George Hotz
c0050fab8f clean up movement_op in cpu and torch 2022-10-28 09:29:12 -07:00
George Hotz
df31dde174 hasattr and DeviceBuffer type fixups 2022-10-28 09:05:45 -07:00
George Hotz
e6b65f8e01 fix graph in openpilot/compile.py 2022-10-28 08:55:34 -07:00
George Hotz
1013540370 fix flake8 2022-10-28 08:52:53 -07:00
George Hotz
804b2dd001 move into graph.py 2022-10-28 08:50:11 -07:00
George Hotz
8517b69bfb lazy cleanups 2022-10-28 08:43:43 -07:00
George Hotz
d02f8f9bc0 can we lose the lines with E701 still there? 2022-10-28 08:36:03 -07:00
George Hotz
ef62db3186 cleanups, remove E701 2022-10-28 08:28:56 -07:00
George Hotz
b65b70812a Exec AST (#404)
* working exec ast

* exec_ast is staticmethod

* GenericExecAST

* fold that sometimes

* ExplicitExecAST

* exec_ast for GPU

* gpu working

* get_lazyop_shape

* now gpubuffer is ExplicitExecAST

* dedup

* add a type

* RESHAPE in opencl code

* fix linter

* that too for linter

* cleanups

* remove dead code

* GenericShape is less lines

* add ALLOWED_KERNEL_COUNT to tests

* fix mypy

* that's gotta be recursive

* fix opencl shape processing

* remove unneeded lambda
2022-10-28 08:27:03 -07:00
George Hotz
6a15fd3844 LLVM Backend take 2 (#403)
* take 2 llvm

* get_lazybuffers -> get_buffers

* llvm tests pass

* fix type issues and refactor LLVM
2022-10-26 20:32:31 -07:00
George Hotz
10921a60c4 more imports from llvm branch 2022-10-26 18:02:36 -07:00
George Hotz
463995e64f relu simpler backward pass 2022-10-26 17:57:32 -07:00
George Hotz
6a8fb53304 move ops.py into lazy.py (#402)
* move ops.py into lazy.py

* fix graph and linter

* ugh, didn't add
2022-10-25 13:58:03 -07:00
George Hotz
8e22d5ee67 replace networkx with defaultdict 2022-10-20 19:36:43 -07:00
George Hotz
3b9b7eda48 remove run_thneed dead code 2022-10-20 17:24:18 -07:00
George Hotz
63f9c55156 really dumb bug 2022-10-20 17:07:47 -07:00
George Hotz
1bec4651b3 fix nonstatic weights 2022-10-20 17:04:14 -07:00
George Hotz
59143bbb3b raise, don't assert 2022-10-20 16:32:34 -07:00
George Hotz
9f8c414589 might fix tests 2022-10-20 16:27:11 -07:00
George Hotz
fd6ba8e7ac don't recopy backing 2022-10-20 16:06:11 -07:00