George Hotz
144e9f00df
viz is local, new test, and new quantize [pr] ( #7859 )
...
* viz is local, new test, and new quantize [pr]
* fix mime types
* remove font
* after index
2024-11-23 14:27:10 +08:00
qazal
9828277c03
view doesn't have buffer, fix the tests [pr] ( #7841 )
...
* view doesn't have buffer, fix the tests [pr]
* need assigns
2024-11-22 20:41:55 +08:00
chenyu
69e382216d
fix wino conv output dtype for half inputs ( #7829 )
2024-11-21 12:13:54 -05:00
George Hotz
fbb4099b3c
add test for compile3 [pr] ( #7783 )
...
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com >
2024-11-19 19:26:51 +08:00
chenyu
73ea913050
really not using numpy in gpt2 example ( #7779 )
2024-11-18 23:21:16 -05:00
chenyu
e6debda5c4
remove numpy from gpt2 and llama examples ( #7778 )
2024-11-18 22:48:17 -05:00
ignaciosica
597a239e28
Remove UnaryOps, BinaryOps, TernaryOps, MetaOps [pr] ( #7725 )
...
* remove unaryops
* remove ternaryops
* remove metaops
* hotfix
* remove binaryops
* hotfix: test_pattern_matcher
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com >
2024-11-16 20:56:56 +08:00
geohotstan
f8056a74d6
combine pad2d with pad ( #7677 )
...
* I have pad2d, I have pad, uuh~, pad2dpad~
* fix some small things
* strategically placed cast hack
* fix more
* fix more more
* tests
* periods
2024-11-14 17:56:02 +08:00
chenyu
4c5f7ddf1f
flux set model path in args ( #7660 )
...
in addition to default downloading through fetch, add an arg to pass model path directly
2024-11-12 22:11:40 -05:00
Harald Schäfer
e7cbc29f48
openpilot benchmark: add cast from numpy to benchmark ( #7593 )
...
* openpilot benchmark: add cast from numpy to benchmark
* whitespace
* comment
2024-11-08 19:31:00 +08:00
Anthony DeMattos
953ef1b57e
tinychat ui +/- 20 lines ( #7471 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com >
2024-11-06 14:23:55 +08:00
George Hotz
c8bf09b7d4
s/UOps/Ops ( #7500 )
...
* s/UOps/Ops [pr]
* fix
2024-11-03 11:26:10 +08:00
George Hotz
72a9ac27e9
support image dtype in cloud [pr] ( #7482 )
...
* support image dtype in cloud [pr]
* remove outdated osx hack
* unused imports
2024-11-02 23:54:27 +08:00
Tobias Fischer
7c9a1d69f9
sdxl gen fix ( #7459 )
2024-11-01 13:57:01 -04:00
gonutz
e7cbc6dc23
Fix ValueError in Yolo 8 example ( #7387 )
...
Calling
python3 examples/yolov8.py ./test/models/efficientnet/Chicken.jpg
used to result in this error
ValueError: Calling nonzero on 0d arrays is not allowed.
Using np.atleast_1d makes sure we avoid a zero-dimension array.
Co-authored-by: gonutz <gonutz@fake.mail >
2024-10-30 10:18:39 +08:00
George Hotz
3989bd2682
idiv + reciprocal [pr] ( #7354 )
...
* idiv + reciprocal
* remove upcast from div
* fix docs
2024-10-29 15:54:19 +08:00
chenyu
4a03e00aa1
fix llama3 download_model assert ( #7320 )
...
false positive if download_model and model are not provided
2024-10-27 11:20:24 -04:00
eliotgolding
e920f1d663
Llama 3.2 1B load from GGUF ( #7295 )
...
* gguf 1b-instruct
* not needed
2024-10-27 09:29:02 +08:00
George Hotz
dc3148c677
hotfix: minor speed increase + stable diffusion relax
2024-10-25 16:27:21 +08:00
leopf
87877d7a91
GGUF cleanup ( #7192 )
...
* cleanup
* remove vocab size hard code
2024-10-21 10:44:54 -04:00
leopf
b6d9b276bb
GGUF support ( #7046 )
...
* basic loader, untested
* testing
* remove utils import in test
* q8_0
* q4_1
* end to end testing
* minor cleanup
* fix casting
* moved to state
* move tests
* move dequant to fn
* fix lint elif
* remove gguf from extra
* fix dict union
* q6_k simpler
* naming and spacing
* gpt2-gguf example
* cleanup
* move gguf example
* minor cleanup
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com >
2024-10-21 16:15:34 +08:00
qazal
30989fb459
changes from the big graph branch [pr] ( #7160 )
...
* metaops srcs
* delete multioutput ctx var
* always has metadata
* shorter path for realized
* this still needs inputs
This reverts commit a59cbb2886 .
2024-10-19 16:22:37 +03:00
Francis Lata
90eff347e2
tinytqdm write support ( #6359 )
...
* add write support
* add test
* update test case to compare write outputs
* assert final write output
* flush when using write
* update write logic
* Revert "update write logic"
This reverts commit 5e0e611b46 .
---------
Co-authored-by: chenyu <chenyu@fastmail.com >
2024-10-16 14:51:41 -04:00
George Hotz
3169cb386d
remove graph [pr] ( #7085 )
2024-10-16 11:40:07 +08:00
George Hotz
26df50cf43
move memory_planner to memory.py [pr] ( #7079 )
2024-10-16 10:04:35 +08:00
chenyu
ed1ed9e4ff
bert use BS=72 ( #7015 )
...
memory 131 -> 138
green tflops 201 -> 209
red tflops 160 -> 169
2024-10-12 09:41:56 -04:00
George Hotz
a71bb09ec3
remove symbolic file [pr] ( #7012 )
2024-10-12 18:44:44 +08:00
George Hotz
5c9f76e274
hotfix: openpilot compile3 compare to i==1
2024-10-12 09:44:24 +08:00
chenyu
36056e0760
update mlperf systems and copy 4.1 to 5.0 ( #7004 )
2024-10-11 16:20:34 -04:00
chenyu
0e42662f2a
log seed at the right place for bert ( #7000 )
2024-10-11 10:39:40 -04:00
nimlgen
5496a36536
update red mlperf bert readme ( #6969 )
2024-10-11 13:08:06 +03:00
Friedrich Carl Eichenroth
859d6d0407
Fix mypy examples/beautiful_*.py ( #6978 )
...
* fix mypy examples/beautiful_*.py
* backwards
* add test
* Revert "add test"
This reverts commit 4d88845ba3 .
---------
Co-authored-by: chenyu <chenyu@fastmail.com >
2024-10-10 11:34:29 -04:00
Kinvert
960c495755
added beautiful fashion mnist and example ( #6961 )
...
* added beautiful fashion mnist and example
* fixing whitespace
* refactor Fashion MNIST to fewer lines
* fix newline to reduce diff
* Update beautiful_mnist.py
* Update beautiful_mnist.py
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com >
2024-10-10 12:01:07 +08:00
chenyu
b5546912e2
10% more TRAIN_STEPS for bert ( #6971 )
...
got two very close run, adding more steps for buffer
2024-10-09 19:21:43 -04:00
chenyu
35cf48659b
limit beam param for bert on green ( #6966 )
...
seems to mitigate the crash
2024-10-09 11:48:18 -04:00
chenyu
1ff2c98f8a
fix logfile name for bert red ( #6952 )
2024-10-08 05:37:52 -04:00
chenyu
a78c96273a
update bert epoch logging ( #6940 )
...
* update bert epoch logging
epoch for bert is simply number of examples seen (which is used for RCP check)
* update total steps too
* more changes
2024-10-08 00:34:06 -04:00
chenyu
102dfe5510
back to 2**10 for bert loss scaler ( #6934 )
...
getting 2 NaN for this, revert back to 2**10
2024-10-07 10:17:21 -04:00
chenyu
0cf815a93a
bert use BS=66 and update hparams ( #6932 )
...
with dropout memory improvement, we can fit BS=66 now. revert back to the hparams in #5891 too
2024-10-07 05:08:27 -04:00
chenyu
718b959349
log epoch start and stop for bert ( #6912 )
2024-10-06 06:39:46 -04:00
chenyu
16c1fa4208
use BEAM=3 for red box bert runs ( #6904 )
...
BEAM=4 slightly exceeded 30 minutes setup
2024-10-05 09:21:12 -04:00
chenyu
0e706227a2
add seed to bert result log filename ( #6903 )
...
* add seed to bert result log filename
* different name for different benchmark
2024-10-05 09:15:24 -04:00
George Hotz
f4ec39fe58
switch symbolic from old to uops, final PR ( #6872 )
...
* switch symbolic from old to uops, final PR
* two wrong answers
* not needed resolves
* symbolic ops passes
* symbolic ops passes
* progress
* tests pass (almost)
* fix last test
* fix some tests
* global binding and unbinding
* Revert "global binding and unbinding"
This reverts commit 9456725630 .
* that test works now
* vars on uop doesn't recurse
* fix fuzzer
* update
* fix type
* fix gpt, it's UOp now
* ssimplify symbolics
2024-10-04 16:42:27 +08:00
chenyu
7391376528
update bert hparams ( #6876 )
...
4h32m with this https://wandb.ai/chenyuxyz/MLPerf-BERT/runs/q99frv1l/overview .
loss scaler 2**13->2**10. matched the closest submission, no nan for ~10 runs.
increased lr and total step a bit.
`PARALLEL=0` after setup, same as resnet.
2024-10-04 00:39:06 -04:00
chenyu
5f77217772
bert default CKPT to 0 ( #6840 )
...
not required
2024-10-01 21:55:56 -04:00
George Hotz
547733e57c
stunning_mnist [run_process_replay] ( #6828 )
...
* stunning_mnist [run_process_replay]
* add loss to stunning mnist
2024-10-01 15:00:48 +08:00
chenyu
f59517754e
add RESET_STEP in bert to control reset ( #6818 )
...
same as resnet
2024-09-30 09:39:04 -04:00
George Hotz
2ed94e447f
gpt2: corealize opt and loss
2024-09-30 09:11:20 +08:00
George Hotz
a76c6c740c
hand pad gpt2 ( #6805 )
2024-09-30 09:03:07 +08:00
chenyu
494b20e886
bert BS back to 54 ( #6791 )
...
60 does not run end to end
2024-09-27 22:16:05 -04:00