qazal
d917895569
map out rangeify errors in test_schedule ( #12211 )
...
* map out rangeify errors in test_schedule
* skip that
* add to ci
2025-09-17 09:10:28 +03:00
chenyu
5b12764b83
add arange cat arange test ( #12217 )
...
simple test case to catch wrong reduce const folding. also clean up the old arange complexity test
2025-09-16 17:12:32 -04:00
chenyu
494bb12500
skip slow cifar bf16 on red benchmark ( #12213 )
...
very slow to compile the fake bf16
2025-09-16 14:55:01 -04:00
chenyu
419e997187
increase benchmark timeout ( #12212 )
...
account for compile cache, and it's annoying that job died due to timeout also messes the machine
2025-09-16 14:09:02 -04:00
chenyu
e555748807
test rangeify const folding ( #12200 )
...
* test rangeify const folding
reduce i know how to fix, multi and test_cast_padded tbd
* test_instancenorm_3d is very slow
2025-09-15 20:03:48 -04:00
chenyu
f732f66709
rangeify test_nn almost pass ( #12198 )
...
* rangeify test_nn almost pass
* issue with jit
* flaky
2025-09-15 17:49:20 -04:00
chenyu
82e037aad5
ci test.yml updates ( #12197 )
...
* ci test.yml updates
move docs together and external_benchmark_schedule to unit
* torch
2025-09-15 17:09:02 -04:00
chenyu
146c31586d
split RANGEIFY ci ( #12196 )
...
one CPU and one CL for speed
2025-09-15 15:41:10 -04:00
chenyu
72e010d816
fix rangeify ci ( #12189 )
...
CL=1, and multitensor needs to test with CPU since CL does not support multi in CI
2025-09-15 10:24:57 -04:00
qazal
f1bd06134d
test fuse with RANGEIFY=2 ( #12187 )
2025-09-15 15:51:23 +03:00
qazal
a388d2cb1a
remove PROFILE=1 option, it's just VIZ=1 [pr] ( #12176 )
...
* remove PROFILE=1 option, it's just VIZ=1 [pr]
* sqtt
* sqtt 2
* return last
* rename
2025-09-15 12:51:50 +03:00
George Hotz
d5bc27797b
fix some multitensor on rangeify ( #12162 )
...
* fix some multitensor on rangeify
* rangeify multi hacks
* copy on const
2025-09-14 14:31:57 +08:00
chenyu
aac3dceaf6
merge two PYTHON backend ci job ( #12143 )
...
* merge two PYTHON backend ci job
and mark anything that takes > 10 in test_ops slow
* two more
2025-09-12 17:36:46 -04:00
George Hotz
a2f502b89e
fix rangeify=1 ops on GPU ( #12130 )
2025-09-12 11:17:37 +08:00
chenyu
e5ef9ec5b1
remove IGNORE_OOB=0 in ci tests ( #12117 )
2025-09-11 15:05:04 -04:00
chenyu
520e2e0727
actually run unit tests in ci MacOS (unit) ( #12122 )
...
* actually run unit tests in ci MacOS (unit)
* that's always wrong
2025-09-11 13:32:30 -04:00
nimlgen
acb700fc26
ci: fix ptx env ( #12120 )
2025-09-11 12:42:15 -04:00
chenyu
20cd7177de
delete test_bert_fuse_arange ( #12121 )
...
* delete test_bert_fuse_arange
it's the default now and we are not interested in FUSE_ARANGE=0 version
* remove -v
2025-09-11 12:35:51 -04:00
chenyu
b07f962058
split metal model tests ( #12119 )
...
* split metal model tests
* llama too
2025-09-11 12:20:12 -04:00
chenyu
66593f135f
remove duplicated test_real_world ( #12118 )
...
included in the test/models right below
2025-09-11 11:57:14 -04:00
nimlgen
400ad93892
ci: gate boost paths for macos only ( #12114 )
2025-09-11 12:48:34 +03:00
chenyu
0e266f376c
ops_gpu -> ops_cl ( #12103 )
2025-09-10 15:15:48 -04:00
nimlgen
fb96394ff5
auto-select available compilers ( #12094 )
...
* device: auto select compilers
* fix
* metal+opencl
* nv/cuda
* test without ptx
* ptx
* fix tests
* fix
* fix test
* rename
* test + cleaner
* xx
* ops
* better test
* win?
* um?
* types
* debug
* win??
* sep rung
* wtf?
* debug
* skip win
* revert this
* types
2025-09-10 19:52:01 +03:00
Sieds Lykles
5b73076e48
assert benchmark times ( #12042 )
...
* assert jitted times in openpilot
* better error
* better error
* add ASSERT_MIN_STEP_TIME to more models
* t is step_times
* update benchmark times
* update times
2025-09-09 23:40:02 +02:00
nimlgen
1c6c42715f
unify cpu and llvm ( #11982 )
...
* try unify cpu and llvm
* fixes
* fix
* ops
* no llvm
* fix
* rm
* lvmm is ot
* oops
* override
* no llvm
* ignore
* skip llvm
* ooops
2025-09-09 13:54:44 +03:00
chenyu
2bd1fff79c
ci GPU misc cleanups ( #12078 )
2025-09-08 16:47:29 -04:00
chenyu
1781d5bced
remove PYTHONPATH in test.yml ( #12077 )
...
set globally already
2025-09-08 15:41:47 -04:00
chenyu
11213398b9
reorder amdremote in test yml ( #12073 )
2025-09-08 13:43:04 -04:00
nimlgen
10ac427aaa
cpu threading ( #11951 )
...
* start cpu threading
* fix
* fix2
* fix
* hacks?
* threads
* minor
* no dsp
* dsp 2
* n
* more
* test
* xm
* cleaner
* readable
* f
* reorder
* when no threads
* rangeify
* typos
* not needed
* reapply
* remoev this
* linter
* fixed cpu count in ci
* fix
* fixes
* rm
* typo
* sort based on speed
* test if test works in ci
* Revert "test if test works in ci"
This reverts commit 1f05edb531.
* do not pad thread
2025-09-06 16:13:43 +03:00
Jordan Chalupka
48ec5efad9
only run autogen tests on change ( #12049 )
...
* only run autogen tests on change
* example change
* rm example change
2025-09-05 23:53:01 -07:00
George Hotz
0123c394e5
early simplfy_merge_adjacent ( #12045 )
...
* do simplify_merge_adjacent before schedule
* do simplify_merge_adjacent before schedule
* disable that slow test
2025-09-05 16:39:20 -07:00
George Hotz
38dcadf07b
delete kernel.py ( #12040 )
...
* delete kernel.py
* delete that file
* rip and tear
* don't test search
* imports
* fix torch frontend
* not a part of regen
2025-09-05 15:52:07 -07:00
George Hotz
433581f8ed
make POSTOPT=2 the default ( #12034 )
...
* make POSTOPT=2 the default
* more matching tc
* fix winograd
* fix that test
* add matvec to Scheduler
* flip tc sort order
* similar speed
* fix beam on image
* disable slow tests
* slow
2025-09-05 14:34:05 -07:00
chenyu
a340723bf1
SKIP_SLOW_TEST=1 for nv CI ( #12031 )
2025-09-05 11:52:02 -04:00
chenyu
ce7163e9b4
clean up skip slow tests in PYTHON ( #12028 )
...
skip with SKIP_SLOW_TEST and decorators
2025-09-05 11:35:26 -04:00
chenyu
5dcc4c7f1b
skip test_linalg in windows unit test ( #12030 )
2025-09-05 11:28:40 -04:00
chenyu
677220ae7e
test_tesnor_data to unit/ ( #12013 )
2025-09-04 19:58:27 -04:00
George Hotz
560df206cc
split tc test ( #12003 )
...
* split tc test
* split hand coded opts
* remove some skipped tests
* skips on emulated
2025-09-04 11:47:56 -07:00
George Hotz
9dee724fc4
make EMULATE a context var ( #12002 )
...
* make EMULATE a context var
* fix test amx
2025-09-04 11:15:43 -07:00
chenyu
ca7574cb2d
ci set PYTHONPATH for all ( #11997 )
2025-09-04 10:06:04 -04:00
George Hotz
5cf42dc4db
add Scheduler to replace Kernel with POSTOPT=2 ( #11924 )
...
* ** simple kernel to replace Kernel for postopt
* support old
* fix beam
* beaming
* beam on old
* bring tensor cores back
* raise
* postbeam
* test ops passes on mac
* skip that
* postopt default
* gate that
* fix tensor cores
* a few test fixes
* dsp fix
* tc fix
* loop
* support swap
* test_gemv
* fix beam for variable
* test opts from high level stuff
* range annoying
* compile slow
* metal slow
* better beam
* no POSTBEAM
* fix nolocals
* hc opt mostly works
* put that back
* lil
* some work
* fix that
* POSTOPT 2
* fix tests
* no postopt 2
* work
* back
* padded tensors cores
* shift_to
* postopt 0 passes?
* write PADTO
* fix padded tensor cores
* compare hcopt
* 18000 lines
* should pass tests
* fix rangeify
* put types back
2025-09-03 19:23:30 -07:00
chenyu
e921fb44ee
clean up testnvidia env ( #11969 )
2025-09-02 18:29:00 -04:00
nimlgen
897254ad6c
ci: add dev<->cpu copy speeds ( #11959 )
2025-09-02 15:22:44 +03:00
nimlgen
a4f05ebd1a
ci: rebuild gpuocelot with boost libs ( #11920 )
2025-08-30 17:24:19 +03:00
nimlgen
cf9d8c8142
ci: pin boost for macos runners ( #11910 )
2025-08-30 01:38:06 +03:00
nimlgen
e8289c75b1
ci: do not reinstall existing pkgs in macos ( #11900 )
2025-08-28 21:20:15 +03:00
chenyu
134cf56904
update cache name for gpuocelot ( #11896 )
2025-08-28 13:11:10 -04:00
Jordan Chalupka
4785cd959a
[TYPED=1] cvar should allow dtype as a tuple ( #11770 )
...
* cvar dtype:DType|tuple[DType, ...]|None=None
* fmt
* add a test
* list typeguard as a dep for CI
* extra step to install mypy
* fix venv
* ci fixes
* mv typeguard to testing install group
* simpler TYPED=1 test
* add typeguard to lint group
2025-08-26 12:49:51 -04:00
George Hotz
66e9d54eed
RANGEIFY=2 is partial contig ( #11777 )
2025-08-21 16:53:58 -07:00
George Hotz
5954a0975f
fix some assigns on rangeify ( #11774 )
...
* fix some assigns
* llvm test
* more tests
* upd test
2025-08-21 15:15:54 -07:00