11640 Commits

Author SHA1 Message Date
qazal
389f01c7f4 viz: amdgpu assembly basic block graph (#13755) 2025-12-22 23:17:16 +08:00
George Hotz
df0f9d6860 add olmoe support to llm (#13792)
* add olmoe support to llm

* cleanups

* simpler

* clean

* fix mypy

* lil

* remove dumb assert
2025-12-22 10:41:35 -04:00
qazal
81d9053013 roc: cast to nullptr instead of changing header (#13801) 2025-12-22 22:34:06 +08:00
nimlgen
d299d30f2c am_smi: fix with new autogen (#13800) 2025-12-22 16:53:26 +03:00
nimlgen
f6bda6ae4e am: continue from saved state (#13799)
* am: gfx queue cont

* f

* reset

* f

* l
2025-12-22 15:55:07 +03:00
qazal
6237bd86f6 sqtt/pmc viz improvements (#13797) 2025-12-22 18:16:35 +09:00
Sitananda Prasad
3000b8d762 symbolic: add x ^ x -> 0 folding pattern (#13794) 2025-12-21 21:47:28 -04:00
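The folding rule named in the commit above (`x ^ x -> 0`) can be illustrated on a toy expression tree. This is a minimal sketch only: the `Var`, `Const`, `Xor`, and `fold` names are hypothetical and are not tinygrad's pattern-matcher API.

```python
from dataclasses import dataclass

# Hypothetical toy expression nodes, for illustration only.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Xor:
    lhs: object
    rhs: object

def fold(e):
    """Rewrite bottom-up, replacing XOR of two identical operands with 0."""
    if isinstance(e, Xor):
        lhs, rhs = fold(e.lhs), fold(e.rhs)
        # Structural equality comes from the frozen dataclasses.
        if lhs == rhs:
            return Const(0)
        return Xor(lhs, rhs)
    return e

x, y = Var("x"), Var("y")
same = fold(Xor(x, x))                   # identical operands fold to Const(0)
diff = fold(Xor(x, y))                   # distinct operands are left alone
nested = fold(Xor(Xor(x, y), Xor(x, y))) # identical subtrees also fold
```

The rule is sound for integers and booleans because every bit XORed with itself is 0, whatever value `x` takes.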
chenyu
5cb827f7bf clean up can_lossless_cast and add missing pairs [p] (#13793) 2025-12-21 12:18:33 -05:00
George Hotz
75a6a03664 add qwen3 moe support to tinygrad.apps.llm (#13775)
* qwen moe works

* simple moe

* one test

* integration
2025-12-21 12:36:02 -04:00
chenyu
29ef0809bb can_safe_cast -> can_lossless_cast (#13789)
safe cast in numpy only means the result won't overflow, so lossless is more precise
2025-12-21 11:29:19 -05:00
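The distinction drawn in the commit message above, between a cast that stays in range and a cast that preserves every value exactly, can be seen with a small round-trip check. This sketch uses only the standard library; `roundtrip_f32` and `survives_f32` are hypothetical helpers and are not tinygrad's `can_lossless_cast`.

```python
import struct

def roundtrip_f32(x: int) -> float:
    # Pack as an IEEE 754 single-precision float, then unpack again.
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

def survives_f32(x: int) -> bool:
    # True only if the integer comes back unchanged after the round trip.
    return roundtrip_f32(x) == x

exact = survives_f32(2**24)      # fits in float32's 24-bit significand
lossy = survives_f32(2**24 + 1)  # in range, but rounds to a neighbouring value
```

Both integers are far below the float32 maximum, so neither cast overflows, yet only the first is preserved exactly. A lossless check has to rule out the second case as well.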
chenyu
ed1fd7023b use getattr in dtype.truncate [pr] (#13788) 2025-12-21 11:05:43 -05:00
qazal
9839838fdd viz UOp layout cleanup (#13787)
* use the same names in server and client

* first layout args, then renderer args
2025-12-21 22:11:40 +08:00
nimlgen
e523971028 am: make mqd contig (#13786) 2025-12-21 17:00:33 +03:00
qazal
09e060eab5 simplify viz node labels (#13784) 2025-12-21 16:45:06 +08:00
qazal
dc660c9fc0 remove stale / untested viz related files (#13785) 2025-12-21 16:42:48 +08:00
George Hotz
59c02dd87f does this fix the dtype test? (#13779)
* does this fix the dtype test?

* simpler
2025-12-20 17:31:46 -04:00
George Hotz
5228f7bd06 hotfix: opencode should not reformat files 2025-12-20 15:55:29 -04:00
chenyu
733ef0452c update test_uop_resolve (#13777)
plain @unittest.expectedFailure is too broad
2025-12-20 12:40:59 -05:00
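Why a bare `@unittest.expectedFailure` can be too broad: it accepts any exception raised in the test body, including unrelated breakage. The sketch below contrasts it with a narrower `assertRaises` check. The `resolve` function and both test classes are hypothetical stand-ins, not the contents of `test_uop_resolve`, and the narrower form shown is one possible alternative rather than what the commit necessarily did.

```python
import unittest

def resolve(x):
    # Hypothetical stand-in for a call that is known to fail in one specific way.
    raise ValueError("cannot resolve")

class TestBroad(unittest.TestCase):
    @unittest.expectedFailure
    def test_resolve(self):
        # An unrelated error is still counted as the "expected" failure.
        raise RuntimeError("unrelated breakage")

class TestNarrow(unittest.TestCase):
    def test_resolve(self):
        # Only the specific, known failure mode is accepted.
        with self.assertRaises(ValueError):
            resolve(1)

def run(case):
    # Run one TestCase class and return the collected result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result

broad = run(TestBroad)
narrow = run(TestNarrow)
```

Here `broad` reports one expected failure even though the test never reached the code under test, while `narrow` would turn into an error if `resolve` started raising anything other than `ValueError`.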
nimlgen
3db2104fb8 am: timeout sos start (#13776) 2025-12-20 17:41:33 +03:00
qazal
94f97f6988 generic viz cleanups from the basic blocks branch (#13774)
* simpler codeblock highlight

* simpler append

* status enum
2025-12-20 18:18:03 +08:00
George Hotz
a987a8ed44 add neg VIZ support to not start server (#13772) 2025-12-20 00:36:38 -04:00
qazal
b7c2f0dd1b remove stale extra/sched directory (#13770) 2025-12-20 11:57:30 +08:00
George Hotz
86cd1e9e81 remove UPatAny for typing fix [pr] (#13766)
* remove UPatAny for typing fix [pr]

* fix dtype
2025-12-19 17:41:18 -04:00
George Hotz
4702da41d5 hotfix: mkdir for extra/disassemblers 2025-12-19 17:18:37 -04:00
George Hotz
45c459848d remove more stale stuff (#13765)
* remove more stale stuff

* remove disassemblers/adreno

* stale
2025-12-19 17:14:56 -04:00
George Hotz
744af193f0 remove ScheduleItem and merge it with ExecItem (#13759)
* remove ExecItem and merge it with ScheduleItem

* less diff

* fix issues

* min diff

* don't change bufs in _lower

* min diff

* update

* revert

* fixes

* diff
2025-12-19 17:04:24 -04:00
George Hotz
df6cde8a00 cleanup stale examples/extra (#13764)
* cleanup stale files

* examples

* move those back

* old

* delete more
2025-12-19 16:27:37 -04:00
chenyu
80b84f5267 ruff lint tinykitten (#13762)
deleted unused import and double spaces. a few ignores to not change the real code
2025-12-19 14:31:00 -05:00
Christopher Milan
97103831c5 Revert "remove image from BufferSpec (#13636)" (#13761)
This reverts commit 2571a1eb47.
2025-12-19 13:54:36 -05:00
Christopher Milan
2571a1eb47 remove image from BufferSpec (#13636)
* remove image from BufferSpec

* cl tiny_gemm (64) works

* mypy

* padding

* openpilot CL

* reshape properly

* remove extra qcom checks

* pad output

* mypy

* update compile test

* move undo

* TestImageCopy valid images

* TestImageRealization valid images

* TestImageDType valid images

* cleanups

* test_renderer_failures

* ruff

* mypy

* simplify ops_qcom

* bump step time
2025-12-19 13:41:20 -05:00
chenyu
185a000882 gradient of COPY (#13760) 2025-12-19 13:33:59 -05:00
nimlgen
57fe4d0a59 am: no_update_ptr for master (#13757) 2025-12-19 19:37:37 +03:00
chenyu
7fcd3cf991 hotfix SPEC for AFTER(CONTIGUOUS) (#13752)
fixed spec error in `PYTHONPATH="." REWRITE_STACK_LIMIT=5000000 NULL=1 DEFAULT_FLOAT="HALF" BERT_LAYERS=2 BENCHMARK=10  BS=128 GPUS=1 MODEL=bert python3 examples/mlperf/model_train.py`
2025-12-19 10:05:45 -04:00
qazal
81b5815a66 viz: minimal data to render a graph (#13754) 2025-12-19 16:19:28 +08:00
Christopher Milan
849e46da21 DLL: _PATH variables can be parent dir (#13753) 2025-12-19 00:28:02 -05:00
qazal
159c0e92fa viz: infrastructure for basic block graphs (#13751) 2025-12-19 13:08:19 +08:00
George Hotz
fa40df972f fix tests for NV (#13744)
* small fix

* min diff

* bfloat16 out
2025-12-18 13:20:21 -04:00
nimlgen
77191fb744 hive_reset for mi350 (#13746) 2025-12-18 12:02:28 +03:00
nimlgen
ceff388f3d am: extend va space (#13745) 2025-12-18 11:20:43 +03:00
wozeparrot
99e667bdcd tk fa bwd (#13480) 2025-12-17 23:56:37 -08:00
George Hotz
aeb7516c8a tests passing on tinybox h3 (#13742) 2025-12-17 19:04:34 -04:00
chenyu
7cd7593c5d add script to train bert on mi350x (#13743)
adapted from mi300 config
2025-12-17 16:54:04 -05:00
George Hotz
22f3e7f995 better precommit coverage and faster (#13740)
* improve pre-commit hook speed and coverage

* remove a few

* lose that
2025-12-17 13:25:55 -04:00
George Hotz
bc78cf1197 filter warnings for nicer test output (#13739) 2025-12-17 13:25:27 -04:00
George Hotz
b013244c38 fix local tests for AMD_LLVM (#13738)
* fix local tests for AMD_LLVM

* fix linters

* skip that for now

* fix segfault
2025-12-17 12:23:46 -04:00
nimlgen
7081014c73 am_smi: mi300 (#13737)
* am_smi: mi300

* smi

* remo
2025-12-17 17:56:01 +03:00
George Hotz
3dbde178c1 mark slow tests as slow instead of as CI (#13736)
* mark slow tests as slow instead of as CI

* CI shouldn't have different behavior

* more skips / CI

* slow
2025-12-17 10:29:57 -04:00
George Hotz
9015a22523 make tests faster (#13734) 2025-12-17 09:39:44 -04:00
nimlgen
3eecb4f123 am: mi350 support (#13733) 2025-12-17 14:57:21 +03:00
wozeparrot
5151a341b3 tk: small changes from fa bwd (#13732) 2025-12-16 22:44:36 -08:00