chenyu | c8dfd10257 | ShapeTracker.real_strides -> is_expanded [pr] (#12579): only keep the used part | 2025-10-09 22:52:45 -04:00
George Hotz | 32e9949052 | rename lazydata to uop (#10698) | 2025-06-08 08:42:22 -07:00
chenyu | c462162db8 | update benchmark bert scripts with BS and ACC_DTYPE (#9826): BS=16, ACC_DTYPE=half for tinybox, BS=128, ACC_DTYPE=float for mi300x | 2025-04-10 02:06:02 -04:00
chenyu | 43e4565148 | weighted linear in external_benchmark_bert_matmuls (#9757): include the linear to get qkv, and permute so that stride matches with the real run | 2025-04-06 23:35:42 -04:00
chenyu | 8a585dc5c1 | benchmark script for matmuls in bert (#9752): 2 main matmuls in the bert layers. getting these to be fast makes bert fast | 2025-04-06 19:34:25 +08:00