Commit Graph

10417 Commits

Author SHA1 Message Date
qazal
76079bc7f2 viz: pick the largest rect for proxy fillColor (#11558) 2025-08-07 16:40:17 +03:00
nimlgen
4f29a2c441 fix flaky test on macos (#11557) 2025-08-07 15:55:35 +03:00
qazal
b3f7ea6f93 viz: add support for colored tooltip text (#11556) 2025-08-07 15:04:43 +03:00
qazal
91ec093464 viz: align-center checkbox (#11555) 2025-08-07 14:22:02 +03:00
qazal
1e205775bd viz: remove color for unbind step (#11554) 2025-08-07 14:16:21 +03:00
nimlgen
031f26632b viz: timeline perf (#11533)
* viz: timeline perf

* progress

* fast

* less lines

* less lines

* less lines

* fix chrome
2025-08-07 13:16:17 +03:00
George Hotz
a1aa5670aa Revert "fix mismatch reduce (#11547)" (#11549)
This reverts commit 49d21a9055.
2025-08-06 22:43:15 -07:00
George Hotz
49d21a9055 fix mismatch reduce (#11547)
* fix mismatch reduce

* cleanups

* fix shape

* fix mypy

* resolve
2025-08-06 21:12:51 -07:00
George Hotz
21570545d3 move view pushing to codegen, try 2 (#11534)
* move view pushing to codegen, try 2

* fix up some linearizer tests

* fix test search

* fix test schedule

* delete that test

* fix test arange

* fix a few tests

* update tests

* push views

* ebs cleanup

* fix local/reg

* test and lint

* fix more tests

* test cleanups

* skipped that one
2025-08-06 15:58:38 -07:00
wozeparrot
2d5bdc939d faster llama3 dataloader (#11540) 2025-08-06 18:25:57 -04:00
George Hotz
80d9cced07 more test cleanups (#11544)
* more test cleanups

* revert that
2025-08-06 15:05:21 -07:00
George Hotz
6fd1332763 update some tests for less Kernel (#11543)
* update some tests for less Kernel

* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
09dc7af8e9 move bind to big graph (#11539)
* move bind to big graph

* fix tests

* unbind inside kernel only

* merge views

* fix multitensor

* failure text change
2025-08-06 13:27:51 -07:00
George Hotz
7c5e115747 test_mismatch_reduce (#11538) 2025-08-06 10:02:14 -07:00
George Hotz
4fe11725c6 pass through sink arg, update linearizer test (#11536)
* pass through sink arg, update linearizer test

* get_program help

* bump line count

* use new api
2025-08-06 09:48:48 -07:00
George Hotz
bfebb5c37b do store in the replace_buffers (#11535) 2025-08-06 08:42:45 -07:00
geohotstan
1163292759 move onnx_parser into onnx (#11530) 2025-08-06 10:46:27 -04:00
George Hotz
7b16fadd87 load view late + simpler rewrite (#11525)
* add the load view later

* simpler replace buffers

* rewrite name
2025-08-06 06:55:11 -07:00
nimlgen
930d8dae0c hcq: lazy prof signal allocation (#11531) 2025-08-06 15:28:11 +03:00
nimlgen
eafc7fda12 upd perfetto (#11528) 2025-08-06 14:00:34 +03:00
nimlgen
1afb290027 ci: fix runner in nv (#11527) 2025-08-06 10:38:04 +03:00
qazal
61dae0685c viz: show total mem in tooltip (#11526) 2025-08-06 06:51:26 +03:00
George Hotz
cf66df0ea6 put load early to make pointers match (#11524) 2025-08-05 20:04:32 -07:00
George Hotz
92175626e3 prereqs: move views to codegen (#11522) 2025-08-05 19:27:58 -07:00
chenyu
c9225d22ce only disable flaky test_jit_multidev_xfer (#11523) 2025-08-05 22:17:25 -04:00
George Hotz
f58fd3143d cleanup fix_kernel (#11520)
* cleanup fix_kernel

* early load buffer

* early meta ops

* move those to fix_kernel_ops

* fix tests

* remote metal was flaky

* Revert "fix tests"

This reverts commit a27019383d.

* that hack broke things

* fine for ptx
2025-08-05 18:38:43 -07:00
George Hotz
067daee5be pin torch to 2.7.1 (#11519) 2025-08-05 15:58:57 -07:00
George Hotz
b39f43c46a optimize in rewrite, try 2 (#11518)
* changes

* fix test uops

* optimize in rewrite, try 2
2025-08-05 15:52:53 -07:00
George Hotz
07b0df0d86 hotfix: test tensor dims start at 1 2025-08-05 15:40:24 -07:00
George Hotz
4dabdf7c6d Revert "optimize in rewrite (#11516)" (#11517)
This reverts commit 3b777a9e05.
2025-08-05 15:39:07 -07:00
George Hotz
3b777a9e05 optimize in rewrite (#11516)
* changes

* fix test uops

* dim shouldn't be 0

* huh, why did that one not save
2025-08-05 15:33:26 -07:00
nimlgen
ec676eddfa nv: move base address higher (#11514) 2025-08-05 22:42:53 +03:00
qazal
7703f8b805 viz: skip flops info if estimates is symbolic (#11513) 2025-08-05 22:12:52 +03:00
nimlgen
fc4e713d1c jit graph split tests (#11507)
* jit graph split tests

* fix

* one more test

* more tests

* fix

* xm

* remote
2025-08-05 21:32:37 +03:00
George Hotz
c57fde51f9 move swizzler to opt (#11509) 2025-08-05 11:31:30 -07:00
chenyu
ace8e9a706 fix test_conv2d_winograd (#11511) 2025-08-05 12:15:46 -04:00
chenyu
223aaa0492 clean up more conv tests (#11510) 2025-08-05 12:15:30 -04:00
Garret Castro
76e62a1c23 extract conv layer test logic (#11488)
* refactor: extract conv layer test logic

* tuple is unnecessary

* integrate _test_conv logic into all conv tests

* fix linter, forgot dilation

* undo winograd extraction

adds too many if statements for a single case
2025-08-05 11:15:54 -04:00
b1tg
8b8bd6c534 make einsum generate same kernels (#11508)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-05 11:12:52 -04:00
uuuvn
011ef8fa9d Fix incorrect jit current batch devs reset (#11505)
`current_batch_devs = []` (in `flush_batch()`) runs between
`new_batched_devs = ...` and `current_batch_devs = new_batched_devs`, so
the reset is immediately overwritten and things do not jit properly.

This doubles remote bert step time (and should have similar effects on any
non-hcq backend).
2025-08-05 08:16:16 +03:00
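The ordering problem described in this commit message can be reproduced in a toy loop. The sketch below is not the tinygrad JIT code: only the names `current_batch_devs`, `new_batched_devs`, and `flush_batch` come from the message. The function `run`, its parameters, and the "flush first" ordering are assumptions made for illustration, and the actual fix in the commit may differ.

```python
def run(devs, flush_at, flush_first):
    """Toy batching loop (hypothetical): returns the device list left after the last step."""
    current_batch_devs = []

    def flush_batch():
        nonlocal current_batch_devs
        current_batch_devs = []  # the intended reset

    for i, dev in enumerate(devs):
        if flush_first and i in flush_at:
            flush_batch()  # reset happens before the new list is computed
        new_batched_devs = current_batch_devs + [dev]
        if not flush_first and i in flush_at:
            flush_batch()  # reset happens here, but is overwritten on the next line
        current_batch_devs = new_batched_devs
    return current_batch_devs


# Ordering from the commit message: the reset has no lasting effect.
stale = run(["A", "B", "C"], flush_at={2}, flush_first=False)
# Assumed reordering: the reset sticks, so the batch restarts at the flush point.
fresh = run(["A", "B", "C"], flush_at={2}, flush_first=True)
```

With the first ordering, `new_batched_devs` was built from the pre-flush state, so assigning it back undoes the reset. With the second, the list is rebuilt from the emptied state.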
chenyu
f02720ca2d fix fuse gate_contiguous unique (#11504) 2025-08-04 23:43:31 -04:00
George Hotz
7f6acfb0d5 give define global and friends a shape (#11502)
* give define global and friends a shape

* ignore negative size

* ptx fix
2025-08-04 19:09:39 -07:00
chenyu
83385e7abc update gradient src in ramp.py (#11499)
that's simplified now
2025-08-04 18:58:03 -04:00
qazal
846a2826ab viz: remove TracingKey.fmt (#11482)
* viz: remove TracingKey.fmt

* remove from test too
2025-08-05 00:00:03 +03:00
chenyu
01d44e8f16 tiny reduce_gradient cleanup [pr] (#11498) 2025-08-04 16:12:53 -04:00
chenyu
8a11af01ed remove broken paperswithcode links in doc (#11497) 2025-08-04 13:12:33 -04:00
leopf
4f0ee4e982 BPE tokenizer (#11415)
* BPE works

* refactor tok

* oops

* basic tests

* fix eval

* smaller diff

* fix error

* proper vocab decoding

* use regex for splitting

* escape ucatrange

* full compat

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-04 09:52:38 -07:00
b1tg
06af9f9236 fix double exception + add name,loc in error msg (#11487)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-04 13:41:23 +03:00
nimlgen
4877aa965a ast seems to probe nv as well (#11494) 2025-08-04 11:47:07 +03:00
chenyu
e0106b6b25 1/(x*c) -> (1/c)*(1/x) (#11491)
example: 2*(2*a).reciprocal() -> a.reciprocal()

# TODO: bounds for reciprocal
# TODO: should z3 work?
2025-08-03 23:35:46 -04:00
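The rewrite `1/(x*c) -> (1/c)*(1/x)` and the example in this commit message can be checked with exact arithmetic. This is a minimal sketch using Python's `fractions` module, not tinygrad's rewrite rule. The helper names `recip_of_product` and `split_recip` are hypothetical, and the check assumes nonzero `x` and `c` (in floating point the two forms may differ by rounding).

```python
from fractions import Fraction


def recip_of_product(x, c):
    """Left-hand side of the rewrite: 1/(x*c)."""
    return 1 / (Fraction(x) * Fraction(c))


def split_recip(x, c):
    """Right-hand side of the rewrite: (1/c)*(1/x)."""
    return (1 / Fraction(c)) * (1 / Fraction(x))


a = Fraction(3, 7)  # arbitrary nonzero value standing in for `a`

# The two forms agree exactly for a nonzero constant c.
same = recip_of_product(a, 2) == split_recip(a, 2)

# Example from the message: 2*(2*a).reciprocal() -> a.reciprocal()
lhs = 2 * recip_of_product(a, 2)
rhs = 1 / a
```

After the split, the constant factor `1/c` can fold with the outer `2`, which is why the example collapses to a single reciprocal.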