1088 Commits

Author SHA1 Message Date
George Hotz
d40673505f new cloud is cloudy [pr] (#7631)
* new cloud is cloudy [pr]

* waste lines to add security

* safety, with speed and less lines

* timing and del

* lines

* cleanups

* restore CloudSession

* bump to 3.10

* quotes

* renderer security
2024-11-11 20:18:04 +08:00
chenyu
e7b18cf5c0 fix load_worlds filter_novariable (#7564)
filter based on "DEFINE_VAR" instead of "Variable". also added a unit test to make sure dataset includes image and variable kernels
2024-11-05 16:06:39 -05:00
chenyu
207bca6cea set PAGE_SIZE=1 and generate new dataset (#7559)
13080 rows in total. both generating and loading this are pretty broken now. filters are wrong for example
2024-11-05 11:25:01 -05:00
George Hotz
6f93e91deb hotfix: lower mnist threshold for non determinism 2024-11-03 11:05:12 +08:00
George Hotz
72a9ac27e9 support image dtype in cloud [pr] (#7482)
* support image dtype in cloud [pr]

* remove outdated osx hack

* unused imports
2024-11-02 23:54:27 +08:00
George Hotz
133fe81cc5 Revert "Revert "move up migrate + new gated fold (#7403)" (#7406)" (#7407)
* Revert "Revert "move up migrate + new gated fold (#7403)" (#7406)"

This reverts commit ea5654a9bc.

* test padded in emulation too

* bring back early folding
2024-10-30 23:25:45 +08:00
George Hotz
d9d4dd6756 faster ci [pr] (#7348) 2024-10-29 14:01:44 +08:00
George Hotz
a5e0f59e41 move autogen to different CI runner [pr] (#7346)
* move autogen to different CI runner [pr]

* balance a bit

* readme back there

* compile enet in autogen
2024-10-29 13:35:22 +08:00
George Hotz
f55c3dcff8 hotfix: bump ocelot 2024-10-29 12:46:24 +08:00
George Hotz
4fed358511 hotfix: timeouts to 20 minutes. better no stats update than a red x 2024-10-25 16:31:52 +08:00
chenyu
d4c94d0d32 disable llama 1 4gpu and 6gpu benchmark (#7276)
having llama3 4gpu and 6gpu should be good enough
2024-10-24 14:19:22 -04:00
chenyu
e6929f2402 RUN_PROCESS_REPLAY=0 on llama 70B and resnet training (#7272)
* RUN_PROCESS_REPLAY=0 on llama 70B and resnet training

also added a 15 minutes total timeout, this cannot grow indefinitely

* add a few more

* a few more just for NV
2024-10-24 12:09:54 -04:00
qazal
4cf7cca91a delete fuzz_schedule [pr] (#7144) 2024-10-18 15:09:39 +03:00
George Hotz
9f4ca88218 hotfix: relax target pct for beautiful_mnist 2024-10-17 12:36:07 +08:00
chenyu
d12c87dc8e use ubuntu-22.04 in CI (#7068)
ubuntu-latest points to 24.04 now, maybe it's this?
2024-10-15 09:44:59 -04:00
chenyu
fbaab30fe3 add timing to fuzz_linearizer (#7056)
and applied smaller FUZZ_MAX_SIZE. this is getting quite slow in CI
2024-10-14 11:57:41 -04:00
nimlgen
feb0bcb58b qcom bench bind to perf cluster (#6996) 2024-10-11 12:21:52 +03:00
George Hotz
f50d0e0ee0 cloud device [pr] (#6964)
* first try at cloud device [pr]

* real separation

* we're free

* clang works

* unhappy with timeout

* better timeouts and free

* unrelated

* use http verbs + add test

* lines + better test

* fix DELETE

* shorter cloud

* split key

* fix sending renderer

* PTXRenderer serialization

* add sessions

* http.client

* minor timeout bump

* fix keep-alive

* inc server timeout

* real fix timeout

* that one too
2024-10-11 12:24:06 +08:00
nimlgen
f9d454aed5 correct kernargs alignment (#6984) 2024-10-11 00:06:28 +03:00
qazal
3724a66716 move test_viz to test/, prereq for tinygrad/viz [pr] (#6972) 2024-10-10 11:40:46 +03:00
qazal
b82023c97e process replay cleanup to generic _pmap [pr] (#6929)
* process replay cleanup to generic _pmap [pr]

* delete `COMPARE_SCHEDULE`
2024-10-07 13:57:05 +08:00
George Hotz
0d6216aba1 bump the download cache (#6896) 2024-10-05 10:23:18 +08:00
George Hotz
0f28e93224 add pickle support for pattern matchers [run_process_replay] (#6816)
* add pickle support for pattern matchers [run_process_replay]

* cleaner and all

* no closures

* fix tests

* revert that

* final

* cleaner

* python 3.8 fix

* add round trip back

* this

* waste lines on this. that's the final line count

* max print better

* more targeted fix

* regrettably add 3.8 support
2024-09-30 21:54:46 +08:00
wozeparrot
2b899164c6 no numpy (#6751) 2024-09-26 16:40:18 +08:00
wozeparrot
c100f3d406 default threefry (#6116) 2024-09-25 17:45:13 +08:00
George Hotz
dd575da7ee real minimum cstyle change (#6709)
* real minimum cstyle change

* make it match

* bring back DEFINE_GLOBAL store marking writable

* bump line count to 9800

* closer

* precompute don't render

* cast/bitcast too

* smem_align

* vectorize

* more pr match

* remove that test

* less PR diff
2024-09-25 12:40:46 +08:00
George Hotz
f45d178a55 hotfix: support JIT_BATCH_SIZE=0, make that the default 2024-09-25 10:36:04 +08:00
George Hotz
52e7f1c108 add new model CI 2024-09-25 10:23:06 +08:00
George Hotz
b0ffe2452b bump line count to 9800 2024-09-25 09:15:30 +08:00
George Hotz
de259e3f09 hotfix: add compile3 to comma CI 2024-09-23 18:25:49 +08:00
qazal
e2d6e10ddf hotfix: reset benchmarks cache for process replay (#6671) 2024-09-23 15:13:02 +08:00
chenyu
26ebb7cab4 don't use div_folding in lt_folding (#6666)
* don't use div_folding in lt_folding

valids 35 -> 13

* fails the same as before
2024-09-23 01:50:18 -04:00
chenyu
da5b741656 removed valid in openpilot conv (#6619)
35 valids left
2024-09-23 00:30:18 -04:00
chenyu
1923932339 canonicalize simplex lt (#6658)
(X := a0*x0 + a1*x1 + ...) > 0 is equivalent to x0 + x1 + ... > 0 if xi >= 0 and ai > 0 for ints
2024-09-22 23:04:47 -04:00
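The equivalence stated in the message above can be checked by brute force over a small range. This is only a sketch of the arithmetic: the function name `simplex_lt_equivalent` and the tested range are illustrative and are not taken from the tinygrad code.

```python
from itertools import product

def simplex_lt_equivalent(coeffs, max_x=3):
    """Check sum(a_i*x_i) > 0  <=>  sum(x_i) > 0 for all 0 <= x_i <= max_x."""
    for xs in product(range(max_x + 1), repeat=len(coeffs)):
        weighted = sum(a * x for a, x in zip(coeffs, xs))
        if (weighted > 0) != (sum(xs) > 0):
            return False  # found a counterexample
    return True

print(simplex_lt_equivalent([2, 5, 7]))   # all coefficients positive
print(simplex_lt_equivalent([2, -5, 7]))  # a negative coefficient breaks the rule
```

With non-negative `x_i` and positive `a_i`, the weighted sum is positive exactly when at least one `x_i` is positive, which is why the coefficients can be dropped.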
chenyu
5707503048 x//a<b -> x <a*b for positive a (#6622)
openpilot valids 47 -> 37
2024-09-20 04:38:47 -04:00
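The rewrite named in this commit title relies on floor division: for integers and positive `a`, `x//a < b` holds exactly when `x < a*b`. A minimal check, assuming Python's floor-division semantics (the helper name `div_lt_rewrite_holds` and the range are hypothetical, not part of tinygrad):

```python
def div_lt_rewrite_holds(a, lo=-50, hi=50):
    """Check x//a < b  <=>  x < a*b for every integer x, b in [lo, hi]."""
    return all((x // a < b) == (x < a * b)
               for x in range(lo, hi + 1)
               for b in range(lo, hi + 1))

# The rule holds for positive divisors ...
print(all(div_lt_rewrite_holds(a) for a in range(1, 10)))
# ... but not for a negative one, hence the "positive a" restriction.
print(div_lt_rewrite_holds(-2))
```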
chenyu
b14c1bc417 UOps.RANGE is_increasing (#6615)
* UOps.RANGE is_increasing

283 -> 47 valids

* test
2024-09-20 03:14:52 -04:00
chenyu
036c2f5b26 validhack use the new style ge for upper bound valid (#6612)
also relaxed the bound check to check vmin/vmax instead of just const.
valids 482 -> 283
2024-09-19 23:45:42 -04:00
George Hotz
a1a882b006 arange folding with new ge (#6604)
* arange folding with new ge

* bump allowed gated

* bump allowed speed
2024-09-19 18:01:28 +08:00
chenyu
d148a62f8d more generic simplify_valid_image_load (#6603)
use graph_rewrite to simplify the expression with narrowed variables, and check boundary conditions on a monotonically increasing function to drop valid.
2024-09-19 05:33:37 -04:00
chenyu
162ead02a9 remove LOAD where valid is an empty set (#6579)
356 -> 354 valids
2024-09-18 03:49:41 -04:00
chenyu
a72d51e277 brute force VALIDHACK matching (#6575)
* brute force VALIDHACK matching

* cleanup

* 9700
2024-09-18 01:59:50 -04:00
qazal
d8e5d5c663 move VIZ=1 tests to fuzzers (#6574) 2024-09-18 12:12:03 +08:00
George Hotz
67a03e72bb remove expr_idxs [run_process_replay] (#6567)
* remove expr_idxs [run_process_replay]

* goodbye that test
2024-09-17 18:34:51 +08:00
chenyu
5fb877c78c generic valid match criteria of #6552 (#6558)
455 -> 364 valids.
generalize `idx < image bound` to `idx < image bound + c` for some `c`
2024-09-17 02:40:36 -04:00
George Hotz
0ab06d5840 push geps through wmma (#6559)
* push geps through wmma

* update tests
2024-09-17 14:38:40 +08:00
chenyu
7c942418a1 other side of simple out of bound valid case (#6552)
462 -> 455
2024-09-16 23:57:15 -04:00
chenyu
aeaf7894a7 more generic version of #6548 (#6549)
x*(-1)<0 can be generalized to x*(-1)<c, 473 -> 462 valids
2024-09-16 23:17:16 -04:00
chenyu
596f41eb46 simple drop image valid case (#6548)
* simple drop image valid case

started unit test, 530 -> 473 valids

* cleanup
2024-09-16 22:54:07 -04:00
chenyu
798be6bb74 add gated read_image count in openpilot compile2 (#6546)
530 to go
2024-09-16 21:17:00 -04:00
George Hotz
cd90092f14 graph rewrite tests (#6519)
* more graph rewrite tests

* more complex test cases

* more tests

* more tests

* cleanups

* 9600 lines

* cleanups
2024-09-15 17:29:16 +08:00