George Hotz
d40673505f
new cloud is cloudy [pr] ( #7631 )
...
* new cloud is cloudy [pr]
* waste lines to add security
* safety, with speed and less lines
* timing and del
* lines
* cleanups
* restore CloudSession
* bump to 3.10
* quotes
* renderer security
2024-11-11 20:18:04 +08:00
chenyu
e7b18cf5c0
fix load_worlds filter_novariable ( #7564 )
...
filter based on "DEFINE_VAR" instead of "Variable". also added a unit test to make sure dataset includes image and variable kernels
2024-11-05 16:06:39 -05:00
chenyu
207bca6cea
set PAGE_SIZE=1 and generate new dataset ( #7559 )
...
13080 rows in total. both generating and loading this are pretty broken now. filters are wrong for example
2024-11-05 11:25:01 -05:00
George Hotz
6f93e91deb
hotfix: lower mnist threshold for non determinism
2024-11-03 11:05:12 +08:00
George Hotz
72a9ac27e9
support image dtype in cloud [pr] ( #7482 )
...
* support image dtype in cloud [pr]
* remove outdated osx hack
* unused imports
2024-11-02 23:54:27 +08:00
George Hotz
133fe81cc5
Revert "Revert "move up migrate + new gated fold ( #7403 )" ( #7406 )" ( #7407 )
...
* Revert "Revert "move up migrate + new gated fold (#7403 )" (#7406 )"
This reverts commit ea5654a9bc .
* test padded in emulation too
* bring back early folding
2024-10-30 23:25:45 +08:00
George Hotz
d9d4dd6756
faster ci [pr] ( #7348 )
2024-10-29 14:01:44 +08:00
George Hotz
a5e0f59e41
move autogen to different CI runner [pr] ( #7346 )
...
* move autogen to different CI runner [pr]
* balance a bit
* readme back there
* compile enet in autogen
2024-10-29 13:35:22 +08:00
George Hotz
f55c3dcff8
hotfix: bump ocelot
2024-10-29 12:46:24 +08:00
George Hotz
4fed358511
hotfix: timeouts to 20 minutes. better no stats update than a red x
2024-10-25 16:31:52 +08:00
chenyu
d4c94d0d32
disable llama 1 4gpu and 6gpu benchmark ( #7276 )
...
having llama3 4gpu and 6gpu should be good enough
2024-10-24 14:19:22 -04:00
chenyu
e6929f2402
RUN_PROCESS_REPLAY=0 on llama 70B and resnet training ( #7272 )
...
* RUN_PROCESS_REPLAY=0 on llama 70B and resnet training
also added a 15 minutes total timeout, this cannot grow indefinitely
* add a few more
* a few more just for NV
2024-10-24 12:09:54 -04:00
qazal
4cf7cca91a
delete fuzz_schedule [pr] ( #7144 )
2024-10-18 15:09:39 +03:00
George Hotz
9f4ca88218
hotfix: relax target pct for beautiful_mnist
2024-10-17 12:36:07 +08:00
chenyu
d12c87dc8e
use ubuntu-22.04 in CI ( #7068 )
...
ubuntu-latest points to 24.04 now, maybe it's this?
2024-10-15 09:44:59 -04:00
chenyu
fbaab30fe3
add timing to fuzz_linearizer ( #7056 )
...
and applied smaller FUZZ_MAX_SIZE. this is getting quite slow in CI
2024-10-14 11:57:41 -04:00
nimlgen
feb0bcb58b
qcom bench bind to perf cluster ( #6996 )
2024-10-11 12:21:52 +03:00
George Hotz
f50d0e0ee0
cloud device [pr] ( #6964 )
...
* first try at cloud device [pr]
* real separation
* we're free
* clang works
* unhappy with timeout
* better timeouts and free
* unrelated
* use http verbs + add test
* lines + better test
* fix DELETE
* shorter cloud
* split key
* fix sending renderer
* PTXRenderer serialization
* add sessions
* http.client
* minor timeout bump
* fix keep-alive
* inc server timeout
* real fix timeout
* that one too
2024-10-11 12:24:06 +08:00
nimlgen
f9d454aed5
correct kernargs alignment ( #6984 )
2024-10-11 00:06:28 +03:00
qazal
3724a66716
move test_viz to test/, prereq for tinygrad/viz [pr] ( #6972 )
2024-10-10 11:40:46 +03:00
qazal
b82023c97e
process replay cleanup to generic _pmap [pr] ( #6929 )
...
* process replay cleanup to generic _pmap [pr]
* delete `COMPARE_SCHEDULE`
2024-10-07 13:57:05 +08:00
George Hotz
0d6216aba1
bump the download cache ( #6896 )
2024-10-05 10:23:18 +08:00
George Hotz
0f28e93224
add pickle support for pattern matchers [run_process_replay] ( #6816 )
...
* add pickle support for pattern matchers [run_process_replay]
* cleaner and all
* no closures
* fix tests
* revert that
* final
* cleaner
* python 3.8 fix
* add round trip back
* this
* waste lines on this. that's the final line count
* max print better
* more targeted fix
* regrettably add 3.8 support
2024-09-30 21:54:46 +08:00
wozeparrot
2b899164c6
no numpy ( #6751 )
2024-09-26 16:40:18 +08:00
wozeparrot
c100f3d406
default threefry ( #6116 )
2024-09-25 17:45:13 +08:00
George Hotz
dd575da7ee
real minimum cstyle change ( #6709 )
...
* real minimum cstyle change
* make it match
* bring back DEFINE_GLOBAL store marking writable
* bump line count to 9800
* closer
* precompute don't render
* cast/bitcast too
* smem_align
* vectorize
* more pr match
* remove that test
* less PR diff
2024-09-25 12:40:46 +08:00
George Hotz
f45d178a55
hotfix: support JIT_BATCH_SIZE=0, make that the default
2024-09-25 10:36:04 +08:00
George Hotz
52e7f1c108
add new model CI
2024-09-25 10:23:06 +08:00
George Hotz
b0ffe2452b
bump line count to 9800
2024-09-25 09:15:30 +08:00
George Hotz
de259e3f09
hotfix: add compile3 to comma CI
2024-09-23 18:25:49 +08:00
qazal
e2d6e10ddf
hotfix: reset benchmarks cache for process replay ( #6671 )
2024-09-23 15:13:02 +08:00
chenyu
26ebb7cab4
don't use div_folding in lt_folding ( #6666 )
...
* don't use div_folding in lt_folding
valids 35 -> 13
* fails the same as before
2024-09-23 01:50:18 -04:00
chenyu
da5b741656
removed valid in openpilot conv ( #6619 )
...
35 valids left
2024-09-23 00:30:18 -04:00
chenyu
1923932339
canonicalize simplex lt ( #6658 )
...
(X := a0*x0 + a1*x1 + ...) > 0 is equivalent to x0 + x1 + ... > 0 if xi >= 0 and ai > 0 for ints
2024-09-22 23:04:47 -04:00
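The equivalence stated in the "canonicalize simplex lt" entry above can be sanity-checked by brute force over small integers. This is only an illustrative sketch: the function names and ranges are hypothetical and are not taken from tinygrad.

```python
from itertools import product

def simplex_lt_equivalent(coeffs, xs):
    """True if (sum(a_i * x_i) > 0) agrees with (sum(x_i) > 0) for these values."""
    weighted = sum(a * x for a, x in zip(coeffs, xs))
    return (weighted > 0) == (sum(xs) > 0)

def check_simplex(n_terms=3, max_coeff=3, max_x=3):
    """Exhaustively test every a_i in [1, max_coeff] and x_i in [0, max_x]."""
    return all(
        simplex_lt_equivalent(coeffs, xs)
        for coeffs in product(range(1, max_coeff + 1), repeat=n_terms)
        for xs in product(range(0, max_x + 1), repeat=n_terms)
    )
```

The preconditions matter: with a negative x_i, for example coefficients (2, 1) and values (1, -1), the weighted sum is positive while the plain sum is zero, so the two comparisons disagree.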
chenyu
5707503048
x//a<b -> x <a*b for positive a ( #6622 )
...
openpilot valids 47 -> 37
2024-09-20 04:38:47 -04:00
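The rewrite named in the "x//a<b -> x <a*b for positive a" entry above can be checked exhaustively on a small integer range. A minimal sketch, assuming Python floor-division semantics; the helper names are hypothetical and not part of tinygrad.

```python
from itertools import product

def rewrite_holds(x: int, a: int, b: int) -> bool:
    """True if (x // a < b) and (x < a * b) give the same answer."""
    return (x // a < b) == (x < a * b)

def check_div_lt_rewrite(limit: int = 12) -> bool:
    """Test every x, b in [-limit, limit] and every positive a in [1, limit]."""
    values = range(-limit, limit + 1)
    return all(
        rewrite_holds(x, a, b)
        for x, a, b in product(values, range(1, limit + 1), values)
    )
```

The restriction to positive a is needed: for x=1, a=-1, b=0 the left side is true (1 // -1 is -1) while the right side (1 < 0) is false.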
chenyu
b14c1bc417
UOps.RANGE is_increasing ( #6615 )
...
* UOps.RANGE is_increasing
283 -> 47 valids
* test
2024-09-20 03:14:52 -04:00
chenyu
036c2f5b26
validhack use the new style ge for upper bound valid ( #6612 )
...
also relaxed the bound check to check vmin/vmax instead of just const.
valids 482 -> 283
2024-09-19 23:45:42 -04:00
George Hotz
a1a882b006
arange folding with new ge ( #6604 )
...
* arange folding with new ge
* bump allowed gated
* bump allowed speed
2024-09-19 18:01:28 +08:00
chenyu
d148a62f8d
more generic simplify_valid_image_load ( #6603 )
...
use graph_rewrite to simplify the expression with narrowed variables, and check boundary conditions on monotonically increasing function to drop valid.
2024-09-19 05:33:37 -04:00
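The boundary-condition idea mentioned in the "more generic simplify_valid_image_load" entry above relies on a general fact: a monotonically increasing function on an integer interval stays within bounds if and only if its values at the two endpoints do. The sketch below illustrates only that fact. It is not tinygrad's implementation, and every name in it is hypothetical.

```python
from typing import Callable

def in_bounds_by_endpoints(f: Callable[[int], int], lo: int, hi: int,
                           lower: int, upper: int) -> bool:
    """Endpoint check; valid only when f is non-decreasing on [lo, hi]."""
    return lower <= f(lo) and f(hi) <= upper

def in_bounds_exhaustive(f: Callable[[int], int], lo: int, hi: int,
                         lower: int, upper: int) -> bool:
    """Reference check that evaluates f at every integer in [lo, hi]."""
    return all(lower <= f(x) <= upper for x in range(lo, hi + 1))

def index_expr(x: int) -> int:
    """Example increasing index expression (an assumption for illustration)."""
    return 4 * x + 3
```

For index_expr on the range 0..7, both checks agree that the values fit in 0..31 and that they do not fit in 0..30, but the endpoint check needs only two evaluations.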
chenyu
162ead02a9
remove LOAD where valid is an empty set ( #6579 )
...
356 -> 354 valids
2024-09-18 03:49:41 -04:00
chenyu
a72d51e277
brute force VALIDHACK matching ( #6575 )
...
* brute force VALIDHACK matching
* cleanup
* 9700
2024-09-18 01:59:50 -04:00
qazal
d8e5d5c663
move VIZ=1 tests to fuzzers ( #6574 )
2024-09-18 12:12:03 +08:00
George Hotz
67a03e72bb
remove expr_idxs [run_process_replay] ( #6567 )
...
* remove expr_idxs [run_process_replay]
* goodbye that test
2024-09-17 18:34:51 +08:00
chenyu
5fb877c78c
generic valid match criteria of #6552 ( #6558 )
...
455 -> 364 valids.
generalize `idx < image bound` to `idx < image bound + c` for some `c`
2024-09-17 02:40:36 -04:00
George Hotz
0ab06d5840
push geps through wmma ( #6559 )
...
* push geps through wmma
* update tests
2024-09-17 14:38:40 +08:00
chenyu
7c942418a1
other side of simple out of bound valid case ( #6552 )
...
462 -> 455
2024-09-16 23:57:15 -04:00
chenyu
aeaf7894a7
more generic version of #6548 ( #6549 )
...
x*(-1)<0 can be generalized to x*(-1)<c, 473 -> 462 valids
2024-09-16 23:17:16 -04:00
chenyu
596f41eb46
simple drop image valid case ( #6548 )
...
* simple drop image valid case
started unit test, 530 -> 473 valids
* cleanup
2024-09-16 22:54:07 -04:00
chenyu
798be6bb74
add gated read_image count in openpilot compile2 ( #6546 )
...
530 to go
2024-09-16 21:17:00 -04:00
George Hotz
cd90092f14
graph rewrite tests ( #6519 )
...
* more graph rewrite tests
* more complex test cases
* more tests
* more tests
* cleanups
* 9600 lines
* cleanups
2024-09-15 17:29:16 +08:00