11681 Commits

Author SHA1 Message Date
George Hotz
d84db5851f calls 2026-01-08 00:55:57 -08:00
George Hotz
37b4751958 isNAN 2026-01-08 00:24:10 -08:00
George Hotz
5e923ccb5e simpler ucode 2026-01-08 00:10:23 -08:00
George Hotz
10836a5dba lil cleanups 2026-01-07 23:49:53 -08:00
George Hotz
56ba96f5cd uops have types 2026-01-07 21:52:55 -08:00
George Hotz
c8b42edec6 a bunch of todos for my boy claude 2026-01-07 20:18:04 -08:00
chenyu
add569d94c test_unrealized_const_input_frozen (#14044)
unrealized const is not replaced in jit
2026-01-07 20:18:04 -08:00
nimlgen
e33d79226d amd: copies w/o sdma (#14036)
* amd: copies w/o sdma

* as_args

* fixes

* f
2026-01-07 20:18:04 -08:00
chenyu
caa52dcbe5 raise when jit fxn returns non-Tensor output (#14042) 2026-01-07 20:18:04 -08:00
chenyu
aa96d826f4 JitError (#14041)
* JitError

* test_symbolic_jit
2026-01-07 20:18:04 -08:00
chenyu
b4fd0954b7 test jit tolist failure (#14040)
also moved tests to test_jit_footguns
2026-01-07 20:18:04 -08:00
chenyu
02ab3eb153 test case for jit a function with item call (#14039)
* test case for jit a function with item call

output is silently wrong now

* no dtype
2026-01-07 20:18:04 -08:00
nimlgen
a6198a67fc mockdsp: use dsp allocator (#14037)
* mockdsp: use dsp allocator

* fix

* ?
2026-01-07 20:18:04 -08:00
wozeparrot
0069cd9a0b tk: support sliced local -> reg load (#14034) 2026-01-07 20:18:04 -08:00
George Hotz
947747eb5e Merge branch 'master' into asm_ucode 2026-01-06 00:16:03 -08:00
George Hotz
45f7fd073d assembly/amd: pcode bug fixes (#14032)
* bring over pcode parser

* fixes

* pdf test

* delay alu
2026-01-06 00:15:48 -08:00
wozeparrot
21d0f6bb76 tk: flat global -> local load (#14033) 2026-01-05 23:35:53 -08:00
qazal
3170365a5b visualize SQTT with the same cfg infrastructure (#13870)
* start

* rough sketch

* post render dag

* art

* intro g key

* work

* custom color scale

* colors

* more blue

* better

* smaller

* use for loop in test
2026-01-06 14:53:20 +09:00
George Hotz
640dac46c2 pcode_exec 2026-01-05 21:15:29 -08:00
Christopher Milan
0120d69caa autogen: avcodec (and simplify workflow) (#14031)
* simplify autogen workflow and add avcodec verification

- Consolidate all regeneration into single steps (delete + import)
- Remove continue-on-error and individual diff checks
- Use git diff at end to catch all differences
- Show artifact URL in failure message
- Add avcodec.py verification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* patch avcodec

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 23:30:25 -05:00
George Hotz
05129e58b0 pcode back 2026-01-05 20:15:30 -08:00
George Hotz
b7dc59a68d fix emu 2026-01-05 20:09:20 -08:00
George Hotz
ec7ec99cbd better 2026-01-05 20:06:52 -08:00
George Hotz
c8c6346336 tests 2026-01-05 19:57:30 -08:00
George Hotz
6de310c87f parsing 2026-01-05 19:54:56 -08:00
George Hotz
ffba806b65 pdf/qcode work 2026-01-05 19:42:47 -08:00
George Hotz
c4016d5cac fix psuedocode parsing 2026-01-05 18:58:55 -08:00
George Hotz
a5587fbda1 Merge origin/master, delete pcode.py 2026-01-05 18:53:43 -08:00
George Hotz
20653d2996 assembly/amd: make pdf.py code shine (#14029)
* assembly/amd: make pdf.py code shine

* no merge

* pdf2 is the future

* something

* regen enums

* test

* work

* remove junk

* write

* pcode extraction

* pdf2 passes all tests

* simplify

* simpler pdf

* late filter

* remove hacks

* simplify pdf2.py

* field type

* remove defaults

* don't export srcenum

* simple pdf.py

* simpler

* cleaner

* less hack in PDF
2026-01-05 18:49:40 -08:00
qazal
ea7b149ca5 viz command line tool (#14030) 2026-01-06 10:19:47 +09:00
Christopher Milan
f86c728440 load libclang as 'libclang.so' too (#14028) 2026-01-05 16:56:16 -05:00
chenyu
eda6a73897 clean up canonicalize_device (#14027)
centralize the type check
2026-01-05 10:29:55 -05:00
chenyu
ce464b147a clean up comments that mentioned outdated terms (#14026)
no MultiLazyBuffer and no ShapeTracker in comments
2026-01-05 09:42:58 -05:00
chenyu
83063cc3e4 onnx TensorScatter (#14024) 2026-01-05 09:05:22 -05:00
chenyu
9497ec00f2 fix onnx attention permute (#14025)
* fix onnx attention permute

* skip test_attention_4d_fp16_cpu too
2026-01-05 08:58:50 -05:00
qazal
5cff5698f7 viz: g key toggles graph and text view (#14023) 2026-01-05 22:41:45 +09:00
chenyu
7a81a3cb98 more passed onnx tests (#14022) 2026-01-05 07:46:27 -05:00
kim yongjin
34fe105386 remove unused LazySeq (#14020) 2026-01-05 07:38:33 -05:00
qazal
4f2f38bf64 viz: split cfg and table render (#14021) 2026-01-05 20:59:08 +09:00
George Hotz
74da1c6310 test qcode 2026-01-05 02:42:27 -08:00
nimlgen
70405b4f3c am_smi: mi350 (#14018) 2026-01-05 13:10:56 +03:00
George Hotz
400d59c06b simpler 2026-01-04 20:37:06 -08:00
Christopher Milan
b2a0b9c551 autogen: dump patch in CI (#14010)
* autogen: don't fast-fail, produce patch artifact on differences

All verification steps now use continue-on-error to run completely.
Each job generates a patch artifact containing all differences found.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* add gen from header test

* fix tests

* fail if diff

* add forward decl autogen test

* remove confusing/wrong comments

* macos unittests set LIBCLANG_PATH

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-04 22:38:12 -05:00
chenyu
aae08b20e0 enable passed onnx tests (#14017) 2026-01-04 22:12:50 -05:00
George Hotz
57684d2777 no pcode 2026-01-04 18:35:16 -08:00
chenyu
785d04d127 simpler einsum (#14014) 2026-01-04 20:38:59 -05:00
George Hotz
8147a78d24 wide dtypes 2026-01-04 17:26:12 -08:00
George Hotz
486248f775 fix pcode 2026-01-04 17:04:52 -08:00
George Hotz
87e72f1540 ftz 2026-01-04 16:32:35 -08:00
chenyu
f6a78a29e0 support einsum trace (#14012)
* support einsum trace

* test_einsum_scalar_cpu
2026-01-04 19:27:27 -05:00