4842 Commits

Author SHA1 Message Date
chenyu
efcb32f6a9 unique const when requires_grad is set to True (#14075)
* unique const when requires_grad is set to True

* fix pyrender
2026-01-08 16:30:45 -05:00
chenyu
b34c637767 support bfloat16 for CL (#14073) 2026-01-08 14:14:29 -05:00
Garret Castro
16b652302e skip bf16 test if not supported by device (#14070) 2026-01-08 13:37:24 -05:00
wozeparrot
027b935269 tk: fix grouped load store (#14035) 2026-01-07 22:38:02 -08:00
chenyu
3caa1e2c98 fix cast HALF with PYTHON backend (#14058) 2026-01-07 16:52:05 -05:00
chenyu
5f1ede7f7e clean up test_dtype (#14055)
use less lambda
2026-01-07 15:45:42 -05:00
chenyu
2833c5a54b few more jit tests with multi tensor inputs (#14047) 2026-01-06 22:05:22 -05:00
chenyu
72a3f78d19 jit includes tensor inputs in containers (#14043)
* jit includes tensor inputs in containers

* cleanup
2026-01-06 19:42:06 -05:00
chenyu
c714881832 don't allow jit input to be const (#14045)
* don't allow jit input to be unbuffered like const

* just const to fix multi

* fix rnnt
2026-01-06 18:15:22 -05:00
chenyu
a8896f28e1 test_unrealized_const_input_frozen (#14044)
unrealized const is not replaced in jit
2026-01-06 14:17:43 -05:00
nimlgen
325f4006ff amd: copies w/o sdma (#14036)
* amd: copies w/o sdma

* as_args

* fixes

* f
2026-01-06 21:15:58 +03:00
chenyu
7fb18f7e47 raise when jit fxn returns non-Tensor output (#14042) 2026-01-06 12:59:20 -05:00
chenyu
4491ec0c9e JitError (#14041)
* JitError

* test_symbolic_jit
2026-01-06 12:19:50 -05:00
chenyu
6ddddc68af test jit tolist failure (#14040)
also moved tests to test_jit_footguns
2026-01-06 11:16:57 -05:00
chenyu
b699b9f763 test case for jit a function with item call (#14039)
* test case for jit a function with item call

output is silently wrong now

* no dtype
2026-01-06 10:40:43 -05:00
qazal
3170365a5b visualize SQTT with the same cfg infrastructure (#13870)
* start

* rough sketch

* post render dag

* art

* intro g key

* work

* custom color scale

* colors

* more blue

* better

* smaller

* use for loop in test
2026-01-06 14:53:20 +09:00
chenyu
83063cc3e4 onnx TensorScatter (#14024) 2026-01-05 09:05:22 -05:00
chenyu
9497ec00f2 fix onnx attention permute (#14025)
* fix onnx attention permute

* skip test_attention_4d_fp16_cpu too
2026-01-05 08:58:50 -05:00
chenyu
7a81a3cb98 more passed onnx tests (#14022) 2026-01-05 07:46:27 -05:00
Christopher Milan
b2a0b9c551 autogen: dump patch in CI (#14010)
* autogen: don't fast-fail, produce patch artifact on differences

All verification steps now use continue-on-error to run completely.
Each job generates a patch artifact containing all differences found.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* add gen from header test

* fix tests

* fail if diff

* add forward decl autogen test

* remove confusing/wrong comments

* macos unittests set LIBCLANG_PATH

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-04 22:38:12 -05:00
chenyu
aae08b20e0 enable passed onnx tests (#14017) 2026-01-04 22:12:50 -05:00
chenyu
f6a78a29e0 support einsum trace (#14012)
* support einsum trace

* test_einsum_scalar_cpu
2026-01-04 19:27:27 -05:00
wozeparrot
f550f9204c fa: failing test for bwd jit (#14009)
* tk: failing test for bwd jit

* feat: mark expectedFailure

* clean: spaces
2026-01-04 16:57:43 -05:00
George Hotz
7abf4591ba use bitsize on dtype (#14011)
* use bitsize on dtype [pr]

* bitsize

* bitsize in js export, but might be wrong

* reverts

* revert that
2026-01-04 12:16:21 -08:00
chenyu
cfb8bf5814 faster image load (#13977)
sometimes image load does not need to init with NAN
2026-01-04 13:09:59 -05:00
qazal
bdb421f13e process_replay: passthrough sink arg for Ops.PROGRAM input (#14000) 2026-01-04 13:09:39 +09:00
chenyu
8003db2a28 test case of NOOP store load folding (#13997) 2026-01-03 14:39:26 -05:00
qazal
2cc64d71b0 simplify mi350x gemm / viz asm tests (#13984)
* mi350x gemm cleanup

* asm tests work

* simpler asm tests
2026-01-03 11:11:07 +09:00
Christopher Milan
9dc524536f IMAGE=1 creates "dynamic" images (#13769)
* remove image from BufferSpec

* cl tiny_gemm (64) works

* mypy

* padding

* openpilot CL

* reshape properly

* remove extra qcom checks

* pad output

* mypy

* update compile test

* move undo

* TestImageCopy valid images

* TestImageRealization valid images

* TestImageDType valid images

* cleanups

* test_renderer_failures

* ruff

* mypy

* simplify ops_qcom

* bump step time

* Revert "bump step time"

This reverts commit 75a037c7d0.

* "dynamic textures" are optional

* a start

* IMAGE=1 works, no FLOAT16

* fast but wrong

* mypy

* some fixes

* better

* works

* refactor

* oops
2026-01-02 16:22:39 -05:00
chenyu
2e2b5fed12 fix misspellings (#13976) 2026-01-02 10:37:38 -05:00
b1tg
a78fcc55a4 amd tc 1616128 (#13439)
* amd tc 1616128

* fix test

* remove hardcoded check in test
2026-01-02 09:01:05 -05:00
wozeparrot
ecbac8a338 tk: fa cleanups + causal test (#13963) 2026-01-01 18:05:00 -08:00
chenyu
af0392efea only set DiskDevice.size if it opens successfully (#13962) 2026-01-01 19:33:26 -05:00
chenyu
e036d6df89 properly fix DiskDevice reuse (#13961) 2026-01-01 18:08:23 -05:00
chenyu
cb7c76a3bd update test_fuzz_failure to not construct full UOp (#13960) 2026-01-01 15:09:58 -05:00
chenyu
51398edf9c fix indirect import (#13958)
also deleted old external tests
2026-01-01 14:22:45 -05:00
chenyu
8e416df438 simpler InvalidType [pr] (#13957)
simpler singleton pattern
2026-01-01 13:55:51 -05:00
chenyu
4d5c4d256d update tqdm for edge case (#13956)
1.00kit/s and not 1000it/s for value 999.5
2026-01-01 11:37:26 -05:00
chenyu
ed222070f7 update xlog2 fp16 decomp to not use fp32 (#13955) 2026-01-01 11:18:29 -05:00
chenyu
c69470be52 fix test_symbolic_arange_sym_step (#13952) 2026-01-01 09:41:07 -05:00
chenyu
b91b46091c delete test_tensor_uop (#13951)
old test for shape tracker. also update tests that refer shapetracker names
2026-01-01 09:25:05 -05:00
chenyu
17ef4af72c new ceildiv that fixed symbolic conv (#13944)
* new ceildiv that fixed symbolic conv

* smaller test case
2026-01-01 09:02:41 -05:00
haofei
526fd4ec71 Fix SVD rank‑1 Jacobi rotation when tau == 0 (#13945) 2026-01-01 00:30:18 -05:00
haofei
20777f30b9 Fix QR/SVD NaNs on zero/orthogonal inputs (#13943) 2025-12-31 23:40:09 -05:00
chenyu
52acadc160 consolidate IGNORE_OOB=0 tests (#13937)
add a new unit test file and add more cases
2025-12-31 15:24:20 -05:00
Christopher Milan
13973e4dea refactor image pitch (#13928) 2025-12-31 13:22:38 -05:00
George Hotz
b998a80b5d assembly/amd: split generated stuff into enum/ins (#13924) 2025-12-31 10:10:52 -05:00
nimlgen
25440f0f72 all2all (#13902)
* all2all

* um

* fix

* x

* um

* simpler

* mypy

* fix

* t

* cmnts
2025-12-31 16:38:32 +03:00
George Hotz
0221b96761 assembly/amd: fix all ops tests (#13910)
* assembly/amd: fix all ops tests

* test_ops with smaller sizes

* ds store/load 2addr
2025-12-30 18:01:34 -05:00
George Hotz
efc99d0c55 assembly/amd: more refactors (#13907)
* assembly/amd: more refactors

* more refactors

* more refactors

* simpler emu

* generate.py

* regen all

* cleanups

* more

* work

* more readme

* lil
2025-12-30 16:13:24 -05:00