qazal
76079bc7f2
viz: pick the largest rect for proxy fillColor ( #11558 )
2025-08-07 16:40:17 +03:00
nimlgen
4f29a2c441
fix flaky test on macos ( #11557 )
2025-08-07 15:55:35 +03:00
qazal
b3f7ea6f93
viz: add support for colored tooltip text ( #11556 )
2025-08-07 15:04:43 +03:00
qazal
91ec093464
viz: align-center checkbox ( #11555 )
2025-08-07 14:22:02 +03:00
qazal
1e205775bd
viz: remove color for unbind step ( #11554 )
2025-08-07 14:16:21 +03:00
nimlgen
031f26632b
viz: timeline perf ( #11533 )
...
* viz: timeline perf
* progress
* fast
* less lines
* less lines
* less lines
* fix chrome
2025-08-07 13:16:17 +03:00
George Hotz
a1aa5670aa
Revert "fix mismatch reduce ( #11547 )" ( #11549 )
...
This reverts commit 49d21a9055.
2025-08-06 22:43:15 -07:00
George Hotz
49d21a9055
fix mismatch reduce ( #11547 )
...
* fix mismatch reduce
* cleanups
* fix shape
* fix mypy
* resolve
2025-08-06 21:12:51 -07:00
George Hotz
21570545d3
move view pushing to codegen, try 2 ( #11534 )
...
* move view pushing to codegen, try 2
* fix up some linearizer tests
* fix test search
* fix test schedule
* delete that test
* fix test arange
* fix a few tests
* update tests
* push views
* ebs cleanup
* fix local/reg
* test and lint
* fix more tests
* test cleanups
* skipped that one
2025-08-06 15:58:38 -07:00
wozeparrot
2d5bdc939d
faster llama3 dataloader ( #11540 )
2025-08-06 18:25:57 -04:00
George Hotz
80d9cced07
more test cleanups ( #11544 )
...
* more test cleanups
* revert that
2025-08-06 15:05:21 -07:00
George Hotz
6fd1332763
update some tests for less Kernel ( #11543 )
...
* update some tests for less Kernel
* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
09dc7af8e9
move bind to big graph ( #11539 )
...
* move bind to big graph
* fix tests
* unbind inside kernel only
* merge views
* fix multitensor
* failure text change
2025-08-06 13:27:51 -07:00
George Hotz
7c5e115747
test_mismatch_reduce ( #11538 )
2025-08-06 10:02:14 -07:00
George Hotz
4fe11725c6
pass through sink arg, update linearizer test ( #11536 )
...
* pass through sink arg, update linearizer test
* get_program help
* bump line count
* use new api
2025-08-06 09:48:48 -07:00
George Hotz
bfebb5c37b
do store in the replace_buffers ( #11535 )
2025-08-06 08:42:45 -07:00
geohotstan
1163292759
move onnx_parser into onnx ( #11530 )
2025-08-06 10:46:27 -04:00
George Hotz
7b16fadd87
load view late + simpler rewrite ( #11525 )
...
* add the load view later
* simpler replace buffers
* rewrite name
2025-08-06 06:55:11 -07:00
nimlgen
930d8dae0c
hcq: lazy prof signal allocation ( #11531 )
2025-08-06 15:28:11 +03:00
nimlgen
eafc7fda12
upd perfetto ( #11528 )
2025-08-06 14:00:34 +03:00
nimlgen
1afb290027
ci: fix runner in nv ( #11527 )
2025-08-06 10:38:04 +03:00
qazal
61dae0685c
viz: show total mem in tooltip ( #11526 )
2025-08-06 06:51:26 +03:00
George Hotz
cf66df0ea6
put load early to make pointers match ( #11524 )
2025-08-05 20:04:32 -07:00
George Hotz
92175626e3
prereqs: move views to codegen ( #11522 )
2025-08-05 19:27:58 -07:00
chenyu
c9225d22ce
only disable flaky test_jit_multidev_xfer ( #11523 )
2025-08-05 22:17:25 -04:00
George Hotz
f58fd3143d
cleanup fix_kernel ( #11520 )
...
* cleanup fix_kernel
* early load buffer
* early meta ops
* move those to fix_kernel_ops
* fix tests
* remote metal was flaky
* Revert "fix tests"
This reverts commit a27019383d.
* that hack broke things
* fine for ptx
2025-08-05 18:38:43 -07:00
George Hotz
067daee5be
pin torch to 2.7.1 ( #11519 )
2025-08-05 15:58:57 -07:00
George Hotz
b39f43c46a
optimize in rewrite, try 2 ( #11518 )
...
* changes
* fix test uops
* optimize in rewrite, try 2
2025-08-05 15:52:53 -07:00
George Hotz
07b0df0d86
hotfix: test tensor dims start at 1
2025-08-05 15:40:24 -07:00
George Hotz
4dabdf7c6d
Revert "optimize in rewrite ( #11516 )" ( #11517 )
...
This reverts commit 3b777a9e05.
2025-08-05 15:39:07 -07:00
George Hotz
3b777a9e05
optimize in rewrite ( #11516 )
...
* changes
* fix test uops
* dim shouldn't be 0
* huh, why did that one not save
2025-08-05 15:33:26 -07:00
nimlgen
ec676eddfa
nv: move base address higher ( #11514 )
2025-08-05 22:42:53 +03:00
qazal
7703f8b805
viz: skip flops info if estimates is symbolic ( #11513 )
2025-08-05 22:12:52 +03:00
nimlgen
fc4e713d1c
jit graph split tests ( #11507 )
...
* jit graph split tests
* fix
* one more test
* more tests
* fix
* xm
* remote
2025-08-05 21:32:37 +03:00
George Hotz
c57fde51f9
move swizzler to opt ( #11509 )
2025-08-05 11:31:30 -07:00
chenyu
ace8e9a706
fix test_conv2d_winograd ( #11511 )
2025-08-05 12:15:46 -04:00
chenyu
223aaa0492
clean up more conv tests ( #11510 )
2025-08-05 12:15:30 -04:00
Garret Castro
76e62a1c23
extract conv layer test logic ( #11488 )
...
* refactor: extract conv layer test logic
* tuple is unnecessary
* integrate _test_conv logic into all conv tests
* fix linter, forgot dilation
* undo winograd extraction
adds too many if statements for a single case
2025-08-05 11:15:54 -04:00
b1tg
8b8bd6c534
make einsum generate same kernels ( #11508 )
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-05 11:12:52 -04:00
uuuvn
011ef8fa9d
Fix incorrect jit current batch devs reset ( #11505 )
...
`current_batch_devs = []` (in `flush_batch()`) runs between
`new_batched_devs = ...` and `current_batch_devs = new_batched_devs`, so it
doesn't actually reset anything. As a result things don't jit properly,
which doubles remote bert step time (it should have similar effects on any
non-hcq backend).
2025-08-05 08:16:16 +03:00
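The ordering problem described in #11505 can be illustrated with a toy sketch. This is not the tinygrad code: `final_devs`, the `events` list, and the `fixed` flag are hypothetical names invented here, and the "fixed" branch is only one plausible way to order the statements. Only `current_batch_devs`, `new_batched_devs`, and `flush_batch()` come from the commit message.

```python
def final_devs(events, fixed):
    """Return the device list left in the batch after processing events.

    events: list of (device, needs_flush) pairs (hypothetical input shape).
    """
    current_batch_devs = []

    def flush_batch():
        nonlocal current_batch_devs
        current_batch_devs = []  # the reset the commit message refers to

    for dev, needs_flush in events:
        if fixed:
            # Flush first, then extend whatever the batch holds afterwards.
            if needs_flush:
                flush_batch()
            new_batched_devs = current_batch_devs + [dev]
        else:
            # Snapshot taken before the flush, so it still holds old devices.
            new_batched_devs = current_batch_devs + [dev]
            if needs_flush:
                flush_batch()
        # In the buggy ordering this assignment overwrites the reset.
        current_batch_devs = new_batched_devs
    return current_batch_devs


events = [("gpu0", False), ("gpu1", False), ("gpu2", True)]
print(final_devs(events, fixed=False))  # ['gpu0', 'gpu1', 'gpu2']
print(final_devs(events, fixed=True))   # ['gpu2']
```

In the buggy ordering the flush has no lasting effect on the device list, which matches the "doesn't actually reset anything" description above.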
chenyu
f02720ca2d
fix fuse gate_contiguous unique ( #11504 )
2025-08-04 23:43:31 -04:00
George Hotz
7f6acfb0d5
give define global and friends a shape ( #11502 )
...
* give define global and friends a shape
* ignore negative size
* ptx fix
2025-08-04 19:09:39 -07:00
chenyu
83385e7abc
update gradient src in ramp.py ( #11499 )
...
that's simplified now
2025-08-04 18:58:03 -04:00
qazal
846a2826ab
viz: remove TracingKey.fmt ( #11482 )
...
* viz: remove TracingKey.fmt
* remove from test too
2025-08-05 00:00:03 +03:00
chenyu
01d44e8f16
tiny reduce_gradient cleanup [pr] ( #11498 )
2025-08-04 16:12:53 -04:00
chenyu
8a11af01ed
remove broken paperswithcode links in doc ( #11497 )
2025-08-04 13:12:33 -04:00
leopf
4f0ee4e982
BPE tokenizer ( #11415 )
...
* BPE works
* refactor tok
* oops
* basic tests
* fix eval
* smaller diff
* fix error
* proper vocab decoding
* use regex for splitting
* escape ucatrange
* full compat
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-04 09:52:38 -07:00
b1tg
06af9f9236
fix double exception + add name,loc in error msg ( #11487 )
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-04 13:41:23 +03:00
nimlgen
4877aa965a
ast seems to probe nv as well ( #11494 )
2025-08-04 11:47:07 +03:00
chenyu
e0106b6b25
1/(x*c) -> (1/c)*(1/x) ( #11491 )
...
example: 2*(2*a).reciprocal() -> a.reciprocal()
# TODO: bounds for reciprocal
# TODO: should z3 work?
2025-08-03 23:35:46 -04:00
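The identity behind #11491 can be checked with exact rational arithmetic. The sketch below only verifies the algebra for nonzero values; it does not use tinygrad or reproduce its rewrite rule, and the helper names `before` and `after` are made up for illustration.

```python
from fractions import Fraction


def before(x, c):
    # Reciprocal of a product: 1/(x*c)
    return 1 / (x * c)


def after(x, c):
    # Constant factor pulled out: (1/c)*(1/x)
    return (1 / c) * (1 / x)


a = Fraction(3, 7)
c = Fraction(2)

# Both forms agree for nonzero x and c.
assert before(a, c) == after(a, c)

# The example from the commit message: 2*(2*a).reciprocal() -> a.reciprocal()
assert 2 * (1 / (2 * a)) == 1 / a
```

Once the constant is pulled out of the reciprocal, the outer multiplication by 2 cancels against 1/2, which is why the example collapses to a single reciprocal.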