1147 Commits

Author SHA1 Message Date
George Hotz
f1471a3b99 speed up rdna3 unit tests + add to CI (#13871)
* speed up rdna3 unit tests

* add test to CI

* faster and simpler

* speedups

* bugfixes

* use helper

* fix CI maybe

* test fixes

* llvm-21 on 24.04

* upd

* llvm-21

* fix test

* bring that back

* merge gen into lib

* test generators
2025-12-29 10:26:48 -05:00
chenyu
f5090192c8 reorder AMD tensor core benchmark test (#13860)
* reorder AMD tensor core benchmark test

* disable that
2025-12-28 12:29:51 -05:00
chenyu
cba05acadf re-enable TYPED=1 import test (#13858) 2025-12-28 11:49:06 -05:00
qazal
a1c1684b91 set .amdhsa_kernarg_size in asm test (#13826) 2025-12-25 13:08:14 +09:00
George Hotz
4702da41d5 hotfix: mkdir for extra/disassemblers 2025-12-19 17:18:37 -04:00
chenyu
80b84f5267 ruff lint tinykitten (#13762)
deleted used import and double spaces. a few ignore to not change the real code
2025-12-19 14:31:00 -05:00
Christopher Milan
97103831c5 Revert "remove image from BufferSpec (#13636)" (#13761)
This reverts commit 2571a1eb47.
2025-12-19 13:54:36 -05:00
Christopher Milan
2571a1eb47 remove image from BufferSpec (#13636)
* remove image from BufferSpec

* cl tiny_gemm (64) works

* mypy

* padding

* openpilot CL

* reshape properly

* remove extra qcom checks

* pad output

* mypy

* update compile test

* move undo

* TestImageCopy valid images

* TestImageRealization valid images

* TestImageDType valid images

* cleanups

* test_renderer_failures

* ruff

* mypy

* simplify ops_qcom

* bump step time
2025-12-19 13:41:20 -05:00
George Hotz
4b741e893f remove REMOTE=1 (#13722)
* remove REMOTE=1

* leave ibverbs
2025-12-16 15:58:10 -04:00
George Hotz
e5a66ace80 multi custom kernel support (#13716)
* multi custom kernel support

* custom kernel xfrom

* works

* no SPEC=2 on ck

* panic

* touchups
2025-12-16 11:36:30 -04:00
George Hotz
7589c897b2 split usbgpu tests into their own benchmark [pr] (#13711) 2025-12-15 21:42:40 -04:00
qazal
6bafd90248 remove unused process replay input [pr] (#13712) 2025-12-16 09:29:35 +08:00
George Hotz
fd49bb512d download cache by job (#13703) 2025-12-15 10:47:17 -05:00
George Hotz
316da9f7ff llm: add created/model fields, non-streaming support, and tests (#13660)
* llm: add created/model fields, non-streaming support, and tests

- Add `created` timestamp and `model` fields to response (required by OpenAI spec)
- Add non-streaming mode support for /v1/chat/completions
- Add `send_data` helper to HTTPRequestHandler for responses with Content-Length
- Refactor viz/serve.py to use send_data
- Add integration tests using real OpenAI client

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* add openai to testing

* toml

* Remove 'openai' from dependencies

Removed 'openai' from the dependencies list.

* bump cache

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 14:50:36 -05:00
George Hotz
f0fa9bcd98 openai api for llm (#13648)
* openai api for llm

* responds to simple request

* schedule cache needs to unbind

* stream works

* share stream code

* 20k

* one print

* cid
2025-12-12 08:25:33 -05:00
nimlgen
cbae33003d ci: add usb4 (#13643)
* ci: add usb4

* debug=3

* undef

* revert
2025-12-11 19:41:41 +03:00
chenyu
2471b49e45 minor bert / llama change from grad acc branch (#13622)
* minor bert / llama change from grad acc branch

* revert those
2025-12-08 16:04:14 -05:00
Christopher Milan
cb3d756547 NAK compile-only test (#13621) 2025-12-08 15:53:46 -05:00
Christopher Milan
a4c3d48aa9 compile-only test for IR3 actually works (#13619) 2025-12-08 15:07:49 -05:00
Christopher Milan
1c16b6e082 Mesa: freedreno (#12746)
* ir3 init

* got a program

* 1 + 1 works

* use isa_disasm instead of shader_disasm

* wip

* matmul works

* works on py3.14

* fix const loading

* skip QCOM failing tests

* cleanup

* args actually work

* add compile-only tests

* fix typo and install tinymesa

* IR3 NULL backend

* (float32) images work

* autogen fix

* fix compile only test

* typo

* mypy happy

* compile-only uses py3.14

* bump mesa

* unify qcom disassembler

* float16 works

* disasm shows in viz

* save a line

* add real del

* variable workgroup sizes

* simplify diff

* bump line count

* properly set wgsz

* regen mesa

* no preamble

* bump lines
2025-12-08 14:02:08 -05:00
chenyu
b981b6f89e remove old llama grad_acc (#13611)
* remove old llama grad_acc

* GRADIENT_ACC_STEPS=1
2025-12-07 13:03:47 -05:00
Christopher Milan
4eae4b0ce6 unify adreno autogen with mesa (#13604)
* unify adreno autogen with mesa

* gen pm4

* TestTiny::test_plus works

* add a6xx enums

* IMAGE=2 TestTiny::test_gemm works

* remove adreno from CI

* cleanup
2025-12-06 15:17:36 -05:00
Christopher Milan
dec2f50aee reenable process replay for lvp (#13592) 2025-12-05 12:36:35 -05:00
chenyu
ac1227575f IMAGE=1 driving_vision in benchmark (#13587) 2025-12-05 10:20:54 -05:00
qazal
6d92e9ffbf hotfix: skip process replay on lvp (#13585) 2025-12-05 19:25:23 +08:00
George Hotz
24ca8eeaa7 small fixups from schedule_cache (#13557) 2025-12-03 15:41:16 -08:00
Douglas Nyberg
f5abd38132 remove tfa dependency: use keras.optimizers.Lamb and tf.raw_ops for LARS (#13555) 2025-12-03 17:48:27 -05:00
chenyu
8902781dc1 enable more benchmarks (#13540)
* enable more benchmarks

* disable some

* adjust ASSERT_MIN_STEP_TIME

* mac NOCLANG=1
2025-12-02 20:31:14 -05:00
George Hotz
21184ae6b1 bump cache to 14 (#13530) 2025-12-02 08:02:19 -08:00
nimlgen
77a76d1b13 device: respect compiler ContextVars (#13523)
* device: envvars for cc

* fix

* fix

* x

* um

* fix

* remote

* em

* cleanup

* typing

* fix

* debug

* lvp?

* ugh

* singl

* rm

* lol

* fix

* ?

* this?

* why?

* rev

* mod test

* l
2025-12-02 14:42:04 +03:00
nimlgen
455dd88236 nv: minimal hevc (#13502)
* nv: minimal hevc

* validate

* not needed

* tralin

* var

* cpu

* fxi

* desc

* move

* cleanup
2025-11-30 16:46:55 +03:00
Sieds Lykles
63a931ff76 Symbolic divisor fuzzer (#13433)
* render z3 range better

* working version

* rename

* add to workflow

* factor out variable_names

* smaller expressions

* smaller

* + back
2025-11-23 20:29:32 +01:00
Christopher Milan
310da2a201 remove hashFiles in setup-tinygrad (#13423)
* fix hashFiles in setup-tinygrad on macos

* remove hashFiles altogether
2025-11-22 17:47:10 -05:00
qazal
903eec3754 fix sz.py tinygrad import in ci (#13418) 2025-11-22 19:20:26 +08:00
wozeparrot
1f648bb1ba feat: reenable mobilenetv2 dsp (#13320) 2025-11-21 15:21:49 -08:00
Christopher Milan
de3593957f Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388)
This reverts commit 0901a40685.
2025-11-20 15:36:13 -05:00
Christopher Milan
4043489803 set curl -f in setup-tinygrad (#13389)
* set curl -f in setup-tinygrad

* test bad redirect

* Revert "test bad redirect"

This reverts commit ad945e7ffc.
2025-11-20 13:45:47 -05:00
Christopher Milan
0901a40685 Revert "autogen: fix formatting on zero-argument function-like macros (#13386)" (#13387)
This reverts commit 58d85d4bab.
2025-11-20 12:45:35 -05:00
Christopher Milan
58d85d4bab autogen: fix formatting on zero-argument function-like macros (#13386)
* fix formatting on zero-argument function-like macros

* autogen tests should run

* ugh
2025-11-20 12:11:04 -05:00
Roelof van Dijk
0dc2ff431d fix: revive torch backend (#13280)
* fix: revive torch backend

* as_strided view vs copy

* Revert "as_strided view vs copy"

This reverts commit 82a61223f2.

* add extra tests (move inplace, add fusion tests)

* better fusion with inplace_op

* no optimizer hooks (break mnist training fusion)

* split off fusion tests in separate file, assert on resnet fusion

fix: remove comments

* cleanup, reduce diff

* reduce diff

* better fusion and identity checks

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-19 15:26:50 -08:00
George Hotz
1a332afa76 spec test on 3.14 (#12957) 2025-11-19 00:43:04 -08:00
chenyu
6372c95094 disable benchmark MobileNetV2 on DSP (#13305)
failed on tinyc2
2025-11-16 09:42:52 -05:00
Christopher Milan
5b823af696 Remove (pypi) clang dep for autogen (#13284)
* no more clang

* regen comgr_3

* ci doesn't need pypi clang

* fix objc

* REGEN for libclang

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-15 09:05:11 -08:00
George Hotz
df53c62a9f bump line count 2025-11-15 08:16:20 -08:00
Christopher Milan
d1bb08c5a1 In-tree autogen: objective c (#13223)
* checkout changes from autogen branch

* move assert

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-14 14:08:42 -08:00
nimlgen
14eb48b13a autogen: rename nv_gpu to nv_570 (#13273)
* autogen: rename nv_gpu to nv_570

* rename
2025-11-14 20:07:19 +08:00
George Hotz
44d84228ff move comgr_3 logic back to the old place (#13266)
* move comgr_3 logic back to the old place

* explicit
2025-11-13 20:05:54 -08:00
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
Harald Schäfer
3af231904e openpilot compile tests: assert pre-rangify speeds (#12775)
* assert pre-rangify speeds

* typo

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 09:39:06 -08:00
George Hotz
263b724143 one cache and bump it (#13258) 2025-11-13 07:33:31 -08:00