chenyu
da61dea1b2
simple failed UOp sub symbolic test case ( #5894 )
2024-08-03 14:27:23 -04:00
Elias Wahl
937bf5fe12
better hparam ( #5891 )
2024-08-03 12:38:53 -04:00
qazal
37cc87ea75
save lines in the scheduler [run_process_replay] ( #5890 )
2024-08-03 14:20:11 +03:00
qazal
56ef9e453e
pad reduceops to the max of each dimension ( #5889 )
...
* early verify
* pad reduceops to the max of each dim
* remove the function
2024-08-03 14:03:30 +03:00
qazal
65fa86901a
indexing fusion 2 ( #5888 )
...
* arange fusion
* kernels that fuse
* tests
2024-08-03 13:13:39 +03:00
qazal
af59b2eea9
tests from the indexing fusion branch ( #5886 )
2024-08-03 11:56:48 +03:00
chenyu
a77eab89ca
UOp mod folding cleanup ( #5885 )
...
move patterns around and update comments
2024-08-02 22:56:32 -04:00
chenyu
d5de44340e
UOp add mod folding ( #5862 )
...
* UOp add mod folding
* that passes now
2024-08-02 18:31:46 -04:00
George Hotz
714d00f325
hotfix: median > mean for sampling clock jitter
2024-08-02 22:07:58 +00:00
George Hotz
7348c40d9d
sampling time sync (8700 lines) ( #5843 )
...
* sampling time sync
* jitter matrix
* comment
* pass mypy
* line count
2024-08-02 14:44:35 -07:00
chenyu
41bbd3f4c1
update UOp mod reduction patterns ( #5883 )
...
prepare generic mod folding, also some test changes from mod folding pr
2024-08-02 17:43:40 -04:00
wozeparrot
acadccf344
comma benchmark ( #5518 )
2024-08-02 14:36:54 -07:00
nimlgen
b4709d294a
hotfix: hcq profiler use mid point for deps flow ( #5882 )
...
* hcq profiler use mid point for deps
* fixes
* mypy
2024-08-02 23:53:10 +03:00
Elias Wahl
4a114756f6
New BERT dataloader ( #5881 )
...
* One file == One topic
* update test
* new dataloader
* update train script
* get index is faster
2024-08-02 15:12:23 -04:00
nimlgen
2777784b91
add dependency viewer to hcq profiler ( #5874 )
...
* hcq profiler support deps
* clean up
* cleaner
* cleanup
* revert this
* linter
* mypy
* add test
* sync is strange, need to take the end
* linter + test
2024-08-02 22:07:01 +03:00
George Hotz
23e8c39288
get program fields in __post_init__ [run_process_replay] ( #5878 )
...
* get program fields in __post_init__ [run_process_replay]
* remove print
2024-08-02 09:57:12 -07:00
qazal
8611fa6c99
apply opts.extra_matcher in process replay [run_process_replay] ( #5877 )
2024-08-02 18:07:58 +03:00
qazal
2a791f7924
fuzz uops is simpler with List[UOp] [run_process_replay] ( #5875 )
...
* remove from fuzz_uops
* update fuzz_uops.py
* add to realize.py
2024-08-02 17:28:15 +03:00
George Hotz
3995f1ddf1
move ops lds estimate to Program [run_process_replay] ( #5872 )
2024-08-01 19:12:07 -07:00
George Hotz
877e0b4ba0
define global only has the index [run_process_replay] ( #5869 )
...
* define global only has the index [run_process_replay]
* fix that linearizer test
* fix ptx
* stupid ptx fix
2024-08-01 19:01:15 -07:00
chenyu
f27f949a5d
Revert "revert some UOp IDIV bound ( #5863 )" ( #5871 )
...
This reverts commit 0c8d202348.
2024-08-01 21:38:31 -04:00
chenyu
df138bc558
Revert "revert a mod pattern ( #5864 )" ( #5870 )
...
This reverts commit 5c8de2d044.
2024-08-01 20:44:26 -04:00
chenyu
1b0314d9ef
Revert "remove one more UOp mod pattern ( #5865 )" ( #5868 )
...
This reverts commit b03b8e18c2.
2024-08-01 20:28:35 -04:00
George Hotz
d73bc85ba9
UOpGraph not in renderer or Program [run_process_replay] ( #5867 )
...
* UOpGraph not in renderer or Program [run_process_replay]
* fix some tests
* fix ptx
2024-08-01 16:20:30 -07:00
chenyu
b392b8edc3
increase atol and rtol test_gemm_fp16 ( #5866 )
...
* increase atol and rtol test_gemm_fp16
made it pass with NOOPT which has larger accumulated error
* revert that
2024-08-01 19:09:58 -04:00
chenyu
b03b8e18c2
remove one more UOp mod pattern ( #5865 )
...
fixed UOP_IS_SYMBOLIC=1 test_failure_40
2024-08-01 18:29:04 -04:00
chenyu
5c8de2d044
revert a mod pattern ( #5864 )
...
fixed UOP_IS_SYMBOLIC=1 linearizer failure 47
2024-08-01 17:24:26 -04:00
nimlgen
34168a64e3
optimize nv profiler ( #5856 )
...
* nv profiler fix
* cleanup hcq a bit
* fixes
* fix
* typo
* all signals put timestamp
* a bit cleaner
* merge fields
* type
* import
* tiny fix
2024-08-01 23:57:45 +03:00
George Hotz
2d3c7e4d4e
some TestPickleJIT tests ( #5860 )
...
* some TestPickleJIT tests
* hotfix: print which opencl device we are using
2024-08-01 12:39:59 -07:00
George Hotz
e347f10d33
hotfix: print which opencl device we are using
2024-08-01 12:39:46 -07:00
chenyu
0c8d202348
revert some UOp IDIV bound ( #5863 )
...
* revert some UOp IDIV bound
breaks conv with UOP_IS_SYMBOLIC, added some conv tests in CI
* those are correct
* skip slow ones
2024-08-01 15:09:06 -04:00
George Hotz
53fcac9e80
hotfix: increase time on flaky NV test
2024-08-01 10:20:07 -07:00
qazal
cedf459843
infra for multi view reduce_info [run_process_replay] ( #5861 )
2024-08-01 19:46:55 +03:00
qazal
26d0265d66
test schedule of LazyBuffers [run_process_replay] ( #5859 )
2024-08-01 19:06:29 +03:00
George Hotz
0e34d83777
hotfix: don't include the old input_rawbuffers in all_resources
2024-08-01 09:00:11 -07:00
chenyu
d609206a4a
move UOp patterns around [run_process_replay] ( #5857 )
...
group lt / div / mod together and minor cleanups
2024-08-01 11:32:08 -04:00
qazal
3e95e2bb0b
mutate reduceop shapes pre ast creation [run_process_replay] ( #5855 )
2024-08-01 15:00:05 +03:00
qazal
ba0a0008aa
early update the reduceop axis [run_process_replay] ( #5854 )
2024-08-01 14:08:40 +03:00
David Hou
eb91423cb4
MLB support reshape for uneven shards ( #5804 )
...
* cleaner uneven reshape
* update test
2024-08-01 02:36:03 -07:00
David González Martínez
0f09b94c43
add failing test for second order derivatives ( #5772 )
...
* add failing test
* fix lint
* fix bad merge
* fix again
* fix test
* more minimal
2024-08-01 02:34:47 -07:00
George Hotz
9d05dfb6f4
move JIT graphing into CapturedJit ( #5852 )
...
* move JIT graphing into CapturedJit
* better
* _jit_cache
* clear inputs cleanup
* test_pickle_jit with graph + cleanup
* 0 is fine to start
* support None in bufs
* alloc real buffers
* cleaner
2024-07-31 20:48:17 -07:00
chenyu
0ec732b494
test lin fail 47 for UOP_IS_SYMBOLIC ( #5853 )
...
failed arange example with UOP_IS_SYMBOLIC
2024-07-31 23:09:22 -04:00
George Hotz
c6a8395f1b
CapturedJit is fun to pickle [run_process_replay] ( #5851 )
...
* CapturedJit is fun to pickle
* export input replace
2024-07-31 17:23:01 -07:00
George Hotz
5ff3e46718
diff symbolic with uops [run_process_replay] ( #5841 )
...
* diff symbolic with uops
* mergeable symbolic diff
2024-07-31 15:15:01 -07:00
George Hotz
72621d9e7c
count the specials in uops [run_process_replay] ( #5848 )
...
* count the specials in uops [run_process_replay]
* cleanups
2024-07-31 14:53:18 -07:00
chenyu
c2ffcf6887
remove the wrong mod UOp pattern ( #5847 )
...
don't think we are hitting it because of the stride construction, and it's wrong and not needed
2024-07-31 16:24:25 -04:00
qazal
8174c438a3
pad test_failure_45 ( #5846 )
2024-07-31 23:08:48 +03:00
George Hotz
8672a9db3f
add test to validate lazyops dims ( #5845 )
2024-07-31 12:59:38 -07:00
chenyu
4fe5b95568
fix UOp ALU bound ( #5844 )
...
* fix UOp ALU bound
root cause of resnet bug, the ALU bound is only correct for scalar, not vectorized
* it can be nan...
2024-07-31 15:19:31 -04:00
George Hotz
5eedd9e3ad
raise the line ceiling to 8600. USE LINES CAREFULLY
2024-07-31 09:56:39 -07:00