George Hotz
3169cb386d
remove graph [pr] (#7085)
2024-10-16 11:40:07 +08:00
George Hotz
38d45dfba5
hotfix: no rng in test/external/external_benchmark_schedule.py
2024-10-12 22:03:04 +08:00
George Hotz
e7a0ffe46a
break out linearization [pr] (#6994)
2024-10-11 15:27:33 +08:00
George Hotz
c08521e823
minor cleanups from toonygrad (#6990)
2024-10-11 14:19:10 +08:00
George Hotz
9dd9f71011
no global kernel stuff [run_process_replay] (#6808)
* use traceback instead of global metadata crap [run_process_replay]
* save the kernel
* correct, imports clean, no device
* UNPARENTED
* speed
* proudly unparented
* Update ops.py
* update tests for unparented
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-30 13:52:33 +08:00
qazal
d24e4b1042
viz more kernel view work (#6659)
2024-09-23 10:48:35 +08:00
George Hotz
282af21b95
hotfix: DEBUG_EXPAND -1 and NOOPT in benchmark schedule
2024-09-06 17:22:30 +08:00
George Hotz
72939901fc
hotfix: ebs print kernel names
2024-08-29 21:20:36 -07:00
George Hotz
365babe391
precompute early_reject [run_process_replay] (#6327)
* precompute early_reject [run_process_replay]
* features for ebs
* fix ocelot cache
2024-08-29 18:26:24 -07:00
George Hotz
26498b322e
add BEAM to external_benchmark_schedule.py
2024-08-23 18:10:46 -07:00
George Hotz
2c42e9c2c6
faster rewrite, no folder in expand/reduce [run_process_replay] (#6216)
* faster rewrite, no folder in expand/reduce [run_process_replay]
* is removing the expander there okay
* parens
* don't reconstruct exact match uop
* fast do_reduce
* expand pyint
* most of the parents gains with less lines
2024-08-20 23:36:58 -07:00
qazal
074cf780dd
add option to only benchmark schedule [run_process_replay] (#6204)
2024-08-20 16:51:27 +03:00
George Hotz
74ee9febec
remove iter from uopgraph (#6110)
* remove iter from uopgraph
* linearize returns uops
* fix tests
* linearize in linearize
* tests fix
* touchup
* test failures
2024-08-16 15:58:29 -07:00
qazal
28c75bf2a6
merge uops with ops (#6111)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
qazal
c23d44c779
AST is UOp (#6030)
* most of the work from the uops2 branch
* schedule
* realize
* kernel
* lowerer
* search
* green
* merge uops with ops
* Revert "merge uops with ops"
This reverts commit 1408a59f12.
* fix benchmark
* remove extra dedup
2024-08-16 22:09:00 +03:00
George Hotz
fa7e734b49
MetaOps.KERNEL (#5543)
2024-07-17 19:41:23 -07:00
George Hotz
fb3011ac61
improve matcher speed [run_process_replay] (#5438)
* improve matcher speed [run_process_replay]
* don't use arg set in ptx
2024-07-12 20:02:19 -07:00
wozeparrot
b80fd7d23c
allow benchmarking forward only (#5436)
2024-07-12 17:37:49 -07:00
George Hotz
14189bca68
graph_dedup function [run_process_replay] (#4955)
2024-06-14 04:24:37 -07:00
George Hotz
508e8a6666
add cpu objdump to LLVM/CLANG (#4537)
2024-05-11 14:28:44 -07:00
George Hotz
328b083e66
lil profiling script
2024-05-11 11:02:44 -07:00