1169 Commits

Author SHA1 Message Date
George Hotz
21184ae6b1 bump cache to 14 (#13530) 2025-12-02 08:02:19 -08:00
nimlgen
77a76d1b13 device: respect compiler ContextVars (#13523)
* device: envvars for cc

* fix

* fix

* x

* um

* fix

* remote

* em

* cleanup

* typing

* fix

* debug

* lvp?

* ugh

* singl

* rm

* lol

* fix

* ?

* this?

* why?

* rev

* mod test

* l
2025-12-02 14:42:04 +03:00
nimlgen
455dd88236 nv: minimal hevc (#13502)
* nv: minimal hevc

* validate

* not needed

* tralin

* var

* cpu

* fxi

* desc

* move

* cleanup
2025-11-30 16:46:55 +03:00
Sieds Lykles
63a931ff76 Symbolic divisor fuzzer (#13433)
* render z3 range better

* working version

* rename

* add to workflow

* factor out variable_names

* smaller expressions

* smaller

* + back
2025-11-23 20:29:32 +01:00
Christopher Milan
310da2a201 remove hashFiles in setup-tinygrad (#13423)
* fix hashFiles in setup-tinygrad on macos

* remove hashFiles altogether
2025-11-22 17:47:10 -05:00
qazal
903eec3754 fix sz.py tinygrad import in ci (#13418) 2025-11-22 19:20:26 +08:00
wozeparrot
1f648bb1ba feat: reenable mobilenetv2 dsp (#13320) 2025-11-21 15:21:49 -08:00
Christopher Milan
de3593957f Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388)
This reverts commit 0901a40685.
2025-11-20 15:36:13 -05:00
Christopher Milan
4043489803 set curl -f in setup-tinygrad (#13389)
* set curl -f in setup-tinygrad

* test bad redirect

* Revert "test bad redirect"

This reverts commit ad945e7ffc.
2025-11-20 13:45:47 -05:00
Christopher Milan
0901a40685 Revert "autogen: fix formatting on zero-argument function-like macros (#13386)" (#13387)
This reverts commit 58d85d4bab.
2025-11-20 12:45:35 -05:00
Christopher Milan
58d85d4bab autogen: fix formatting on zero-argument function-like macros (#13386)
* fix formatting on zero-argument function-like macros

* autogen tests should run

* ugh
2025-11-20 12:11:04 -05:00
Roelof van Dijk
0dc2ff431d fix: revive torch backend (#13280)
* fix: revive torch backend

* as_strided view vs copy

* Revert "as_strided view vs copy"

This reverts commit 82a61223f2.

* add extra tests (move inplace, add fusion tests)

* better fusion with inplace_op

* no optimizer hooks (break mnist training fusion)

* split off fusion tests in separate file, assert on resnet fusion

fix: remove comments

* cleanup, reduce diff

* reduce diff

* better fusion and identity checks

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-19 15:26:50 -08:00
George Hotz
1a332afa76 spec test on 3.14 (#12957) 2025-11-19 00:43:04 -08:00
chenyu
6372c95094 disable benchmark MobileNetV2 on DSP (#13305)
failed on tinyc2
2025-11-16 09:42:52 -05:00
Christopher Milan
5b823af696 Remove (pypi) clang dep for autogen (#13284)
* no more clang

* regen comgr_3

* ci doesn't need pypi clang

* fix objc

* REGEN for libclang

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-15 09:05:11 -08:00
George Hotz
df53c62a9f bump line count 2025-11-15 08:16:20 -08:00
Christopher Milan
d1bb08c5a1 In-tree autogen: objective c (#13223)
* checkout changes from autogen branch

* move assert

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-14 14:08:42 -08:00
nimlgen
14eb48b13a autogen: rename nv_gpu to nv_570 (#13273)
* autogen: rename nv_gpu to nv_570

* rename
2025-11-14 20:07:19 +08:00
George Hotz
44d84228ff move comgr_3 logic back to the old place (#13266)
* move comgr_3 logic back to the old place

* explicit
2025-11-13 20:05:54 -08:00
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
Harald Schäfer
3af231904e openpilot compile tests: assert pre-rangify speeds (#12775)
* assert pre-rangify speeds

* typo

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 09:39:06 -08:00
George Hotz
263b724143 one cache and bump it (#13258) 2025-11-13 07:33:31 -08:00
chenyu
3f939f3d3c update pm_simplify_valid (#13241)
* update pm_simplify_valid

fixed openpilot conv regression

* IMAGE training is broken
2025-11-12 19:40:02 -05:00
George Hotz
ab9fa964d8 DISABLE_COMPILER_CACHE -> CCACHE (#13234)
* DISABLE_COMPILER_CACHE -> CCACHE

* Fix cachekey assignment in Compiler constructor
2025-11-12 15:07:09 -08:00
Christopher Milan
41a098a82d In-tree autogen: libc.py (#13217)
* checkout changes from autogen branch

* parents

* pylint happy

* move sys to system in helpers.py

* typo

* typo
2025-11-11 19:13:48 -08:00
chenyu
23b90945c3 add a benchmark for openpilot vision with DEBUG=2 (#13219)
see per kernel speed, also disable the jobs for 0.9.9
2025-11-11 14:41:52 -05:00
Gaétan Lepage
6fd7ce3832 migrate to pyproject.toml (#13189)
* migrate to pyproject.toml

* move mypy config to pyproject.toml
2025-11-11 09:09:27 -08:00
chenyu
60e55d9a2d line count 18500 (#13191) 2025-11-10 13:52:13 -05:00
chenyu
6c48c87e51 improved ASSERT_MIN_STEP_TIME (#13182)
* improved ASSERT_MIN_STEP_TIME

getting close, current time +1ms  then round up

* relax
2025-11-09 16:41:12 -05:00
chenyu
e1d46de8f8 update GROUPTOP heuristic more (#13178)
reverts #13176
2025-11-09 02:31:12 -05:00
chenyu
8e868dced8 only GROUPTOP one reduce kernel (#13176)
* only GROUPTOP one reduce kernel

* ALLOWED_GATED_READ_IMAGE=148
2025-11-08 22:38:44 -05:00
George Hotz
42b34cf83d bottom up linearizer (#13133)
* bottom up linearizer

* late stores

* more complete

* remove broken heuristic

* upcast size

* opt

* more conservative

* it needs that

* disable opencl half on QCOM

* fix

* make that a real test

* cpu test okay

* ptx skip

* end is after the range
2025-11-06 15:30:32 -08:00
chenyu
54141e9cb9 DISABLE_COMPILER_CACHE=1 in speed_v_theoretical (#13096) 2025-11-04 11:28:18 -05:00
chenyu
ddf01fdb15 revert mlperf.yml setting (#13080) 2025-11-03 15:24:13 -05:00
chenyu
a317d6e625 extra/amdpci/setup_python_cap.sh (#13070) 2025-11-02 19:19:36 -05:00
chenyu
ad501ce50a mlperf cron install tqdm (#13069)
one more...
2025-11-02 18:09:27 -05:00
chenyu
2c8d619147 mlperf cron install influxdb3-python (#13068) 2025-11-02 17:55:40 -05:00
chenyu
4c22f089fc mlperf cron install tensorflow try 2 (#13067) 2025-11-02 17:11:01 -05:00
chenyu
c58cf91850 mlperf cron install tensorflow (#13066) 2025-11-02 16:48:05 -05:00
chenyu
74db65cf72 update mlperf bert LOGMLPERF (#13065) 2025-11-02 15:26:37 -05:00
chenyu
b18293de96 train bert in mlperf cron (#13064)
more relevant now
2025-11-02 15:04:02 -05:00
George Hotz
036ee9f84c Self type + mixins (#13056)
* use Self type

* mixin

* fix later
2025-11-02 13:30:01 +08:00
George Hotz
65a0a31475 AMD mi350x matmul from stream (#13040)
* works

* working mfma

* 120 TFLOPS

* regs

* 192 TFLOPS

* try pipelining

* something

* notes

* contract

* linter to 3.11

* that was a bug
2025-11-01 17:55:19 +08:00
nimlgen
f6786c1bfd autogen: py314 (#13038)
* autogen: py314

* bump py?
2025-11-01 04:02:19 +08:00
George Hotz
5eb87ab131 hotfix: bump cifar time to 350 2025-10-30 17:29:20 +08:00
nimlgen
4b001ec723 amd: pmc in mockgpu (#13000)
* amd: pmc in mockgpu

* fix

* do not open in ci
2025-10-30 01:52:02 +08:00
b1tg
bb307b9e81 fix fp8 vectorization (#12977)
* fix fp8 vectorization

* add fp8 tc to benchmark
2025-10-28 13:55:30 -04:00
George Hotz
5e01cc299b zero len ranges fail (#12974)
* zero len ranges fail

* fix Python backend

* fix llvm

* fix ptx

* yolo fix nir

* this works...

* always store...

* always store...

* Revert "always store..."

This reverts commit 0816cf344d.
2025-10-28 22:49:55 +08:00
George Hotz
e936aa7974 cleanups from if range branch (#12973) 2025-10-28 20:58:47 +08:00
George Hotz
2832954bcb test with IGNORE_OOB=0 (#12960) 2025-10-28 10:32:19 +08:00