4736 Commits

Author SHA1 Message Date
George Hotz
316da9f7ff llm: add created/model fields, non-streaming support, and tests (#13660)
* llm: add created/model fields, non-streaming support, and tests

- Add `created` timestamp and `model` fields to response (required by OpenAI spec)
- Add non-streaming mode support for /v1/chat/completions
- Add `send_data` helper to HTTPRequestHandler for responses with Content-Length
- Refactor viz/serve.py to use send_data
- Add integration tests using real OpenAI client

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* add openai to testing

* toml

* Remove 'openai' from dependencies

Removed 'openai' from the dependencies list.

* bump cache

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 14:50:36 -05:00
nimlgen
e36385e570 am: support xgmi systems (#13659)
* am: support xgmi systems

* fake_am
2025-12-12 18:55:45 +03:00
Jakob Sachs
ab2220b834 Handle missing bfloat16 natives on CPU architectures (#13553)
* CPU: fix compiler-rt libcall by adding intermediate casts for bfloat16

* fix lint

* remove old manual bypass of bf16 for CPU tests, and add diversion conversion from bf16 to/from fp16

---------

Co-authored-by: Jakob Sachs <jakobs99@purelymail.com>
2025-12-11 15:38:43 -05:00
chenyu
03600aef1e failed test case when init jit with empty inputs (#13641)
not related to bert grad acc, but still seems to be a bug
2025-12-10 22:03:06 -05:00
Nino Risteski
76d465dbc3 optim empty shard #13513 (#13598)
* optim empty shard

* remove tuple

* simplify

* lint

* lint2

* test

* remove original buffer unique id

* new rule

* reset shard

* update

* reset shard
2025-12-09 12:28:36 -05:00
ayanhan
47a170be2e test: enable cummax scalar IndexError test (#13625) 2025-12-09 12:25:56 -05:00
Christopher Milan
a17077d1d9 skip test_double_assign in CI LVP (#13620) 2025-12-08 14:54:02 -05:00
Christopher Milan
1c16b6e082 Mesa: freedreno (#12746)
* ir3 init

* got a program

* 1 + 1 works

* use isa_disasm instead of shader_disasm

* wip

* matmul works

* works on py3.14

* fix const loading

* skip QCOM failing tests

* cleanup

* args actually work

* add compile-only tests

* fix typo and install tinymesa

* IR3 NULL backend

* (float32) images work

* autogen fix

* fix compile only test

* typo

* mypy happy

* compile-only uses py3.14

* bump mesa

* unify qcom disassembler

* float16 works

* disasm shows in viz

* save a line

* add real del

* variable workgroup sizes

* simplify diff

* bump line count

* properly set wgsz

* regen mesa

* no preamble

* bump lines
2025-12-08 14:02:08 -05:00
Douglas Nyberg
947c6eefc3 add Swish op (#13541)
* add Swish ONNX operator

* add Swish regression test

* remove trailing whitespace

* upgrade ONNX to 1.20, add excludes for unimplemented ops

* upgrade ONNX to 1.19, add Swish op

* upgrade ONNX to 1.19, TensorFlow to 2.18, add Swish op

* exclude attention_3d and attention_4d_gqa tests

* exclude attention fp16 tests

* exclude all attention tests

* retrigger CI

* retrigger CI - worker crash
2025-12-08 12:41:18 -05:00
Christopher Milan
94d7646bdc fix anonymous struct fields (#13610) 2025-12-07 12:56:38 -05:00
nimlgen
ac5f1e115d autogen: repro for the bug (#13607)
* autogen: repro for the test

* mute
2025-12-07 15:51:03 +03:00
wozeparrot
93f1baca77 feat: tk fa in tensor (#13580) 2025-12-05 14:36:29 -08:00
George Hotz
c5bd28e21d start work on schedule cache (#13529)
* start work on schedule cache

* local unique

* schedule cache works

* schedule cache cleanup

* fix tests

* preserve metadata

* oops, fix cache

* put that there

* fix spec

* always miss

* why is that broken?

* src[0].op

* fix process replay

* delete abstractions2

* reenable the actual schedule cache

* metadata is best effort

* fix JIT in examples/gradaccum_mnist.py

* full jit

* fixed and test is real
2025-12-04 17:24:49 -08:00
chenyu
42f6cf3a90 tighter test_real_world mem and kernel count bounds (#13573)
also check if actual usage is within 20% of set limit, the old limits are too big to be useful
2025-12-04 13:35:39 -05:00
chenyu
89f9e1dcd5 add SGD to beautiful_mnist (#13571) 2025-12-04 12:17:29 -05:00
Rory Clear
6eab756578 fix and test loading num_batches_tracked (#13538)
* fix and test loading num_batches_tracked

* add failing reverse case

* try reshape state dict if mismatch

* reshape for () and (1,)

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-12-04 01:22:49 -08:00
Douglas Nyberg
a8a62bc08e add max/min reduction support to ScatterND (#13562) 2025-12-04 00:53:47 -08:00
ayanhan
edf929ec9d fix: add __delitem__ to Tensor with proper TypeError (#13561) 2025-12-04 00:53:08 -08:00
ayanhan
92b40290c7 fix: add test_sum_int and remove outdated TODO in test_custom_kernel (#13560) 2025-12-03 21:51:58 -05:00
Christopher Milan
0a54434b15 mitigate ctypes c_bool bitfield bug (#13558)
* mitigate ctypes c_bool bitfield bug

* don't delete old test
2025-12-03 20:46:04 -05:00
George Hotz
24ca8eeaa7 small fixups from schedule_cache (#13557) 2025-12-03 15:41:16 -08:00
Douglas Nyberg
f5abd38132 remove tfa dependency: use keras.optimizers.Lamb and tf.raw_ops for LARS (#13555) 2025-12-03 17:48:27 -05:00
George Hotz
a4c4e48385 add LUNIQUE op (#13554) 2025-12-03 14:34:34 -08:00
chenyu
22777a89ea minor test_uop_symbolic updates (#13551) 2025-12-03 13:17:44 -05:00
chenyu
a205f98ef4 tighter bound for MOD (#13550) 2025-12-03 11:24:29 -05:00
nimlgen
549f3287a8 fix caching for fetch (#13544) 2025-12-03 14:34:14 +03:00
George Hotz
6bd355fa26 add needs_second_gpu decorator (#13543)
* add needs_second_gpu decorator

* more skips

* two more fixes
2025-12-02 19:08:23 -08:00
wozeparrot
0d55aec605 fix after end (#13542) 2025-12-02 18:42:58 -08:00
George Hotz
055d5aeb7f add external_test_process_count 2025-12-02 17:26:30 -08:00
chenyu
e8879f7e31 match torch clamp backward (#13533)
* match torch clamp backward

* fix PYTHON
2025-12-02 17:58:32 -05:00
Roelof van Dijk
c158e3c988 add cifar gated uop_given_valid regression test (#13536) 2025-12-02 16:02:47 -05:00
Roelof van Dijk
e329baffa7 fix cifar while keeping openpilot fused (#13528)
* this works

* test now passes
2025-12-02 12:05:56 -08:00
nimlgen
0874ba8cc8 test_hevc: do not download the whole file (#13531)
* test_hevc: do not download the whole file

* fix
2025-12-02 21:31:28 +03:00
qazal
366badaa68 require renderer argument in get_program, removes device opening in process replay [pr] (#13524) 2025-12-03 02:05:31 +08:00
Douglas Nyberg
6a7c58abf1 fix(onnx): unwrap list/tuple value in Pad op (#13500)
* fix(onnx): unwrap list/tuple value in Pad op

* add regression test for Pad list value

* remove trailing whitespace

* use _resolve_const for Pad constant_value
2025-12-02 07:47:20 -08:00
nimlgen
77a76d1b13 device: respect compiler ContextVars (#13523)
* device: envvars for cc

* fix

* fix

* x

* um

* fix

* remote

* em

* cleanup

* typing

* fix

* debug

* lvp?

* ugh

* singl

* rm

* lol

* fix

* ?

* this?

* why?

* rev

* mod test

* l
2025-12-02 14:42:04 +03:00
wozeparrot
1b7dbfb37f tk: named kernels + per kernel range id (#13522) 2025-12-01 22:51:04 -08:00
nimlgen
455dd88236 nv: minimal hevc (#13502)
* nv: minimal hevc

* validate

* not needed

* tralin

* var

* cpu

* fix

* desc

* move

* cleanup
2025-11-30 16:46:55 +03:00
George Hotz
fd373fea7a fix a few tests [pr] (#13498) 2025-11-29 13:43:45 -08:00
George Hotz
6a140f74fe split out unique_const and cache const [pr] (#13493)
* split out unique_const

* add cache to const

* call const in unique_const
2025-11-29 10:44:28 -08:00
George Hotz
c38b7684dc improve microbenchmarks (#13492)
* improve microbenchmarks

* bugfix + ubench

* lil

* no src in const method
2025-11-29 10:15:22 -08:00
kamilisjon
3d76ef9ba8 Update tests (#13479) 2025-11-28 18:35:28 -08:00
qazal
ae9c56134e skip test_tk failing locally on macbook (#13476) 2025-11-29 01:15:37 +08:00
qazal
72ef533d9c tracing: use u32 for buffer args encoding (#13472) 2025-11-28 00:19:51 +08:00
George Hotz
18addc0a1d process replay only get_program (#13475) 2025-11-27 08:18:18 -08:00
George Hotz
a8e005b095 enable process replay (non-checking) by default (#13474) 2025-11-27 07:28:44 -08:00
George Hotz
05cd2279d0 add cache on reshape (#13466)
* remove cache on divmod, way less objects

* _apply_reshape

* reshape

* no gc on realize

* wow that cache is fast
2025-11-26 18:57:40 -08:00
George Hotz
19228e8d37 test_graph is flaky 2025-11-26 16:37:42 -08:00
George Hotz
e4cd649ff0 remove kernelize to prepare for refactors (#13463)
* remove kernelize to prepare for refactors

* less kernelize

* last test
2025-11-26 14:18:50 -08:00
wozeparrot
ffc31a23f4 tk mi350 (#13288) 2025-11-25 15:49:44 -08:00