Commit Graph

11429 Commits

Author SHA1 Message Date
Christopher Milan
75a037c7d0 bump step time 2025-12-19 17:48:57 +00:00
Christopher Milan
f04b972df4 simplify ops_qcom 2025-12-19 17:29:27 +00:00
Christopher Milan
68502a0e88 mypy 2025-12-19 17:29:27 +00:00
Christopher Milan
7462edaffe ruff 2025-12-19 17:29:27 +00:00
Christopher Milan
083f58ee59 test_renderer_failures 2025-12-19 17:29:27 +00:00
Christopher Milan
44a1d0a1af cleanups 2025-12-19 17:29:27 +00:00
Christopher Milan
51106f9cf0 TestImageDType valid images 2025-12-19 17:29:27 +00:00
Christopher Milan
cef8eac987 TestImageRealization valid images 2025-12-19 17:29:27 +00:00
Christopher Milan
645c07cf8c TestImageCopy valid images 2025-12-19 17:29:27 +00:00
Christopher Milan
0fadb319b2 move undo 2025-12-19 17:29:27 +00:00
Christopher Milan
7133e60ce9 update compile test 2025-12-19 17:29:27 +00:00
Christopher Milan
8ef81d0eb9 mypy 2025-12-19 17:29:27 +00:00
Christopher Milan
b23eaa8b01 pad output 2025-12-19 17:29:27 +00:00
Christopher Milan
1c8084138b remove extra qcom checks 2025-12-19 17:29:27 +00:00
Christopher Milan
19ab0b8789 reshape properly 2025-12-19 17:29:27 +00:00
Christopher Milan
607afefda6 openpilot CL 2025-12-19 17:29:27 +00:00
Christopher Milan
bf7fb2309a padding 2025-12-19 17:29:27 +00:00
Christopher Milan
530eb6e682 mypy 2025-12-19 17:29:27 +00:00
Christopher Milan
89ed801aaf cl tiny_gemm (64) works 2025-12-19 17:29:27 +00:00
Christopher Milan
2704104178 remove image from BufferSpec 2025-12-19 17:29:27 +00:00
nimlgen
57fe4d0a59 am: no_update_ptr for master (#13757) 2025-12-19 19:37:37 +03:00
chenyu
7fcd3cf991 hotfix SPEC for AFTER(CONTIGUOUS) (#13752)
fixed spec error in `PYTHONPATH="." REWRITE_STACK_LIMIT=5000000 NULL=1 DEFAULT_FLOAT="HALF" BERT_LAYERS=2 BENCHMARK=10  BS=128 GPUS=1 MODEL=bert python3 examples/mlperf/model_train.py`
2025-12-19 10:05:45 -04:00
qazal
81b5815a66 viz: minimal data to render a graph (#13754) 2025-12-19 16:19:28 +08:00
Christopher Milan
849e46da21 DLL: _PATH variables can be parent dir (#13753) 2025-12-19 00:28:02 -05:00
qazal
159c0e92fa viz: infrastructure for basic block graphs (#13751) 2025-12-19 13:08:19 +08:00
George Hotz
fa40df972f fix tests for NV (#13744)
* small fix

* min diff

* bfloat16 out
2025-12-18 13:20:21 -04:00
nimlgen
77191fb744 hive_reset for mi350 (#13746) 2025-12-18 12:02:28 +03:00
nimlgen
ceff388f3d am: extend va space (#13745) 2025-12-18 11:20:43 +03:00
wozeparrot
99e667bdcd tk fa bwd (#13480) 2025-12-17 23:56:37 -08:00
George Hotz
aeb7516c8a tests passing on tinybox h3 (#13742) 2025-12-17 19:04:34 -04:00
chenyu
7cd7593c5d add script to train bert on mi350x (#13743)
adapted from mi300 config
2025-12-17 16:54:04 -05:00
George Hotz
22f3e7f995 better precommit coverage and faster (#13740)
* improve pre-commit hook speed and coverage

* remove a few

* lose that
2025-12-17 13:25:55 -04:00
George Hotz
bc78cf1197 filter warnings for nicer test output (#13739) 2025-12-17 13:25:27 -04:00
George Hotz
b013244c38 fix local tests for AMD_LLVM (#13738)
* fix local tests for AMD_LLVM

* fix linters

* skip that for now

* fix segfault
2025-12-17 12:23:46 -04:00
nimlgen
7081014c73 am_smi: mi300 (#13737)
* am_smi: mi300

* smi

* remo
2025-12-17 17:56:01 +03:00
George Hotz
3dbde178c1 mark slow tests as slow instead of as CI (#13736)
* mark slow tests as slow instead of as CI

* CI shouldn't have different behavior

* more skips / CI

* slow
2025-12-17 10:29:57 -04:00
George Hotz
9015a22523 make tests faster (#13734) 2025-12-17 09:39:44 -04:00
nimlgen
3eecb4f123 am: mi350 support (#13733) 2025-12-17 14:57:21 +03:00
wozeparrot
5151a341b3 tk: small changes from fa bwd (#13732) 2025-12-16 22:44:36 -08:00
chenyu
fda73c8180 support LAMB param offload (#13730)
also added Tensor.shard_like
2025-12-16 19:56:30 -05:00
George Hotz
cf0c28d5ae all tests pass on strix halo (#13728) 2025-12-16 19:35:50 -04:00
Christopher Milan
af1d938a50 DLL: search wsl lib folder (#13727) 2025-12-16 18:27:09 -05:00
George Hotz
0fb645cc4c move some methods to mixins (#13725)
* move some methods to mixins

* a few more

* math trunc
2025-12-16 19:20:04 -04:00
Christopher Milan
c6ba016da6 fix cuda check (#13726) 2025-12-16 18:00:09 -05:00
George Hotz
ee45669d14 pre extract afters + sched cleanups (#13720)
* pre extract afters + sched cleanups

* claude.md lesson

* tests for schedule cache

* Revert "tests for schedule cache"

This reverts commit fb3f2e800a.
2025-12-16 16:14:30 -04:00
George Hotz
4b741e893f remove REMOTE=1 (#13722)
* remove REMOTE=1

* leave ibverbs
2025-12-16 15:58:10 -04:00
George Hotz
4d8d821f56 create schedule before the cache (#13717)
* create schedule before the cache

* move create_schedule

* simpler

* simpler

* simpler
2025-12-16 14:15:31 -04:00
George Hotz
bfe374c7f5 support symbolic shapes in split/chunk when split dim is concrete (#13718)
* support symbolic shapes in split/chunk when split dim is concrete

Previously split() and chunk() required all dimensions to be concrete.
Now they only require the dimension being split to be concrete, allowing
them to work with tensors that have symbolic shapes in other dimensions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* update CLAUDE.md: add pre-commit and no-amend rules

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix dim resolution order in split/chunk

Ensure dim_sz is retrieved after dim is resolved, not before.
The previous one-liner evaluated self.shape[dim] with the original
unresolved dim value.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 13:55:06 -04:00
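
Illustrative note on bfe374c7f5: the commit message describes two ideas. Only the dimension being split has to be concrete, and its size should be read after `dim` is resolved. The sketch below is not the repository's code. `resolve_dim`, `split_sizes`, and the use of a `str` to stand in for a symbolic dimension are all hypothetical names and assumptions chosen for illustration.

```python
from typing import List, Sequence, Union

# Assumption: a symbolic dimension is modeled as a str; concrete ones are ints.
Dim = Union[int, str]

def resolve_dim(dim: int, ndim: int) -> int:
    """Map a possibly negative dim to a non-negative index (hypothetical helper)."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for {ndim} dimensions")
    return dim + ndim if dim < 0 else dim

def split_sizes(shape: Sequence[Dim], sizes: Union[int, Sequence[int]], dim: int = 0) -> List[int]:
    """Return the chunk sizes along `dim`; other dimensions may stay symbolic."""
    dim = resolve_dim(dim, len(shape))  # resolve first ...
    dim_sz = shape[dim]                 # ... then read the size of that dimension
    if not isinstance(dim_sz, int):
        raise ValueError("the dimension being split must be concrete")
    if isinstance(sizes, int):
        if sizes <= 0:
            raise ValueError("chunk size must be positive")
        # Equal chunks of `sizes`, with a smaller final chunk if needed.
        sizes = [min(sizes, dim_sz - i) for i in range(0, dim_sz, sizes)]
    sizes = list(sizes)
    if sum(sizes) != dim_sz:
        raise ValueError("split sizes must sum to the dimension size")
    return sizes
```

With these assumptions, `split_sizes(("N", 10), 4, dim=-1)` succeeds because the last dimension is concrete, while `split_sizes(("N", 10), 4, dim=0)` raises because the split dimension is symbolic.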
chenyu
e428fbfab6 verify dtype of llama model params (#13719) 2025-12-16 12:32:02 -05:00
George Hotz
e5a66ace80 multi custom kernel support (#13716)
* multi custom kernel support

* custom kernel xfrom

* works

* no SPEC=2 on ck

* panic

* touchups
2025-12-16 11:36:30 -04:00