Commit Graph

885 Commits

Anthony DeMattos
953ef1b57e tinychat ui +/- 20 lines (#7471)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-06 14:23:55 +08:00
George Hotz
c8bf09b7d4 s/UOps/Ops (#7500)
* s/UOps/Ops [pr]

* fix
2024-11-03 11:26:10 +08:00
George Hotz
72a9ac27e9 support image dtype in cloud [pr] (#7482)
* support image dtype in cloud [pr]

* remove outdated osx hack

* unused imports
2024-11-02 23:54:27 +08:00
Tobias Fischer
7c9a1d69f9 sdxl gen fix (#7459) 2024-11-01 13:57:01 -04:00
gonutz
e7cbc6dc23 Fix ValueError in Yolo 8 example (#7387)
Calling

    python3 examples/yolov8.py ./test/models/efficientnet/Chicken.jpg

used to result in this error

    ValueError: Calling nonzero on 0d arrays is not allowed.

Using np.atleast_1d makes sure we avoid a zero-dimension array.

Co-authored-by: gonutz <gonutz@fake.mail>
2024-10-30 10:18:39 +08:00
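
A minimal standalone sketch of the NumPy behavior behind this fix (plain NumPy, not the yolov8 code itself):

    import numpy as np

    score = np.array(0.9)         # 0-d array, e.g. a single detection score
    # score.nonzero() raises: ValueError: Calling nonzero on 0d arrays is not allowed.
    score = np.atleast_1d(score)  # promote to shape (1,); 1-d and higher pass through
    idx = score.nonzero()         # now fine: (array([0]),)
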
George Hotz
3989bd2682 idiv + reciprocal [pr] (#7354)
* idiv + reciprocal

* remove upcast from div

* fix docs
2024-10-29 15:54:19 +08:00
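
A hedged reading of the split the PR title names: float division can be lowered to a reciprocal plus a multiply, while integer division must stay a true idiv because the reciprocal trick is not exact. In plain Python terms:

    # float division is commonly lowered as a * (1.0 / b): one reciprocal, one mul
    def fdiv(a: float, b: float) -> float:
        return a * (1.0 / b)

    # integers cannot use that trick without rounding error, so idiv stays its own op
    def idiv(a: int, b: int) -> int:
        return a // b  # floor semantics here; the exact rounding rule is backend-defined
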
chenyu
4a03e00aa1 fix llama3 download_model assert (#7320)
the assert was a false positive when neither download_model nor model was provided
2024-10-27 11:20:24 -04:00
eliotgolding
e920f1d663 Llama 3.2 1B load from GGUF (#7295)
* gguf 1b-instruct

* not needed
2024-10-27 09:29:02 +08:00
George Hotz
dc3148c677 hotfix: minor speed increase + stable diffusion relax 2024-10-25 16:27:21 +08:00
leopf
87877d7a91 GGUF cleanup (#7192)
* cleanup

* remove vocab size hard code
2024-10-21 10:44:54 -04:00
leopf
b6d9b276bb GGUF support (#7046)
* basic loader, untested

* testing

* remove utils import in test

* q8_0

* q4_1

* end to end testing

* minor cleanup

* fix casting

* moved to state

* move tests

* move dequant to fn

* fix lint elif

* remove gguf from extra

* fix dict union

* q6_k simpler

* naming and spacing

* gpt2-gguf example

* cleanup

* move gguf example

* minor cleanup

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-21 16:15:34 +08:00
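
For orientation, a rough sketch of dequantizing the simplest format listed above, q8_0, following the public GGML block layout (34-byte blocks: one fp16 scale then 32 int8 weights); illustrative only, not tinygrad's actual loader:

    import numpy as np

    def dequant_q8_0(raw: bytes) -> np.ndarray:
        # each q8_0 block is an fp16 scale d plus 32 int8 quants; value = d * q
        blocks = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 34)
        d = blocks[:, :2].copy().view(np.float16).astype(np.float32)  # (n, 1) scales
        q = blocks[:, 2:].copy().view(np.int8).astype(np.float32)     # (n, 32) quants
        return (d * q).flatten()
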
qazal
30989fb459 changes from the big graph branch [pr] (#7160)
* metaops srcs

* delete multioutput ctx var

* always has metadata

* shorter path for realized

* this still needs inputs

This reverts commit a59cbb2886.
2024-10-19 16:22:37 +03:00
Francis Lata
90eff347e2 tinytqdm write support (#6359)
* add write support

* add test

* update test case to compare write outputs

* assert final write output

* flush when using write

* update write logic

* Revert "update write logic"

This reverts commit 5e0e611b46.

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-16 14:51:41 -04:00
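
The pattern behind adding write to a live progress bar (a generic sketch, not tinytqdm's exact internals): erase the in-place bar line, emit the message with a flush, then redraw the bar.

    import sys

    class Bar:
        def __init__(self, total): self.total, self.n = total, 0
        def update(self, n=1):
            self.n += n
            sys.stderr.write(f"\r{self.n}/{self.total}")  # redraw the bar in place
        def write(self, s):
            sys.stderr.write("\r\033[K" + s + "\n")  # erase the bar line, print the message
            sys.stderr.flush()                       # flush so it appears immediately
            self.update(0)                           # redraw the bar below the message
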
George Hotz
3169cb386d remove graph [pr] (#7085) 2024-10-16 11:40:07 +08:00
George Hotz
26df50cf43 move memory_planner to memory.py [pr] (#7079) 2024-10-16 10:04:35 +08:00
chenyu
ed1ed9e4ff bert use BS=72 (#7015)
memory 131 -> 138
green tflops 201 -> 209
red tflops 160 -> 169
2024-10-12 09:41:56 -04:00
George Hotz
a71bb09ec3 remove symbolic file [pr] (#7012) 2024-10-12 18:44:44 +08:00
George Hotz
5c9f76e274 hotfix: openpilot compile3 compare to i==1 2024-10-12 09:44:24 +08:00
chenyu
36056e0760 update mlperf systems and copy 4.1 to 5.0 (#7004) 2024-10-11 16:20:34 -04:00
chenyu
0e42662f2a log seed at the right place for bert (#7000) 2024-10-11 10:39:40 -04:00
nimlgen
5496a36536 update red mlperf bert readme (#6969) 2024-10-11 13:08:06 +03:00
Friedrich Carl Eichenroth
859d6d0407 Fix mypy examples/beautiful_*.py (#6978)
* fix mypy examples/beautiful_*.py

* backwards

* add test

* Revert "add test"

This reverts commit 4d88845ba3.

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-10 11:34:29 -04:00
Kinvert
960c495755 added beautiful fashion mnist and example (#6961)
* added beautiful fashion mnist and example

* fixing whitespace

* refactor Fashion MNIST to fewer lines

* fix newline to reduce diff

* Update beautiful_mnist.py

* Update beautiful_mnist.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-10 12:01:07 +08:00
chenyu
b5546912e2 10% more TRAIN_STEPS for bert (#6971)
got two very close runs; adding more steps as a buffer
2024-10-09 19:21:43 -04:00
chenyu
35cf48659b limit beam param for bert on green (#6966)
seems to mitigate the crash
2024-10-09 11:48:18 -04:00
chenyu
1ff2c98f8a fix logfile name for bert red (#6952) 2024-10-08 05:37:52 -04:00
chenyu
a78c96273a update bert epoch logging (#6940)
* update bert epoch logging

epoch for bert is simply the number of examples seen (which is used for the RCP check)

* update total steps too

* more changes
2024-10-08 00:34:06 -04:00
chenyu
102dfe5510 back to 2**10 for bert loss scaler (#6934)
got 2 NaNs with this; reverting back to 2**10
2024-10-07 10:17:21 -04:00
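
Background on why the scaler value matters (a generic mixed-precision sketch, not the mlperf bert script; compute_loss and opt are placeholders): the loss is scaled up before backward so fp16 gradients don't underflow, then gradients are unscaled before the update. Too large a scale overflows instead, giving the NaNs seen here.

    LOSS_SCALER = 2**10  # 2**13 overflowed to NaN on some runs; 2**10 was stable

    def train_step(model, x, y, opt):
        opt.zero_grad()
        loss = compute_loss(model, x, y)   # placeholder for the real loss
        (loss * LOSS_SCALER).backward()    # scale up: fp16 grads stay representable
        for p in opt.params:
            p.grad = p.grad / LOSS_SCALER  # unscale before the optimizer step
        opt.step()
        return loss
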
chenyu
0cf815a93a bert use BS=66 and update hparams (#6932)
with the dropout memory improvement, we can fit BS=66 now. also reverted back to the hparams in #5891
2024-10-07 05:08:27 -04:00
chenyu
718b959349 log epoch start and stop for bert (#6912) 2024-10-06 06:39:46 -04:00
chenyu
16c1fa4208 use BEAM=3 for red box bert runs (#6904)
BEAM=4 slightly exceeded the 30-minute setup limit
2024-10-05 09:21:12 -04:00
chenyu
0e706227a2 add seed to bert result log filename (#6903)
* add seed to bert result log filename

* different name for different benchmark
2024-10-05 09:15:24 -04:00
George Hotz
f4ec39fe58 switch symbolic from old to uops, final PR (#6872)
* switch symbolic from old to uops, final PR

* two wrong answers

* not needed resolves

* symbolic ops passes

* symbolic ops passes

* progress

* tests pass (almost)

* fix last test

* fix some tests

* global binding and unbinding

* Revert "global binding and unbinding"

This reverts commit 9456725630.

* that test works now

* vars on uop doesn't recurse

* fix fuzzer

* update

* fix type

* fix gpt, it's UOp now

* ssimplify symbolics
2024-10-04 16:42:27 +08:00
chenyu
7391376528 update bert hparams (#6876)
4h32m with this https://wandb.ai/chenyuxyz/MLPerf-BERT/runs/q99frv1l/overview.

loss scaler 2**13 -> 2**10. matched the closest submission; no NaN in ~10 runs.

increased lr and total steps a bit.

`PARALLEL=0` after setup, same as resnet.
2024-10-04 00:39:06 -04:00
chenyu
5f77217772 bert default CKPT to 0 (#6840)
not required
2024-10-01 21:55:56 -04:00
George Hotz
547733e57c stunning_mnist [run_process_replay] (#6828)
* stunning_mnist [run_process_replay]

* add loss to stunning mnist
2024-10-01 15:00:48 +08:00
chenyu
f59517754e add RESET_STEP in bert to control reset (#6818)
same as resnet
2024-09-30 09:39:04 -04:00
George Hotz
2ed94e447f gpt2: corealize opt and loss 2024-09-30 09:11:20 +08:00
George Hotz
a76c6c740c hand pad gpt2 (#6805) 2024-09-30 09:03:07 +08:00
chenyu
494b20e886 bert BS back to 54 (#6791)
BS=60 does not run end to end
2024-09-27 22:16:05 -04:00
chenyu
572d77d1d9 bert script delete eval data after eval (#6790)
fits BS=60, which is 2% faster than 54. also fixed wandb logging params
2024-09-27 20:54:00 -04:00
chenyu
f9c8e144ff chmod +x mlperf bert script for red (#6789)
also disabled raising the power cap in setup. wozeparrot mentioned that's unstable and might cause bert training issues on red
2024-09-27 11:27:32 -04:00
Francis Lata
d3a387be63 [MLPerf] Prepare openimages dataset script (#6747)
* prepare openimages for MLPerf

* cleanup

* fix issue when clearing jit_cache on retinanet eval

* revert pandas-specific changes
2024-09-27 11:13:56 -04:00
chenyu
2fc26890c9 default BS=9 in handcode_opt bert (#6783)
using BS=54 across 6 GPUs (9 per GPU) now, and 2 is not a good default
2024-09-27 04:38:16 -04:00
George Hotz
9a3f6f392d llm.c tok/s 2024-09-27 00:46:18 -07:00
George Hotz
b0e70ab04f llm.c updates 2024-09-27 15:25:59 +08:00
chenyu
bea7ed5986 add RUNMLPERF=1 to bert dev_run.sh (#6775)
already set in run_and_time.sh; RUNMLPERF=1 is needed for it to load real data
2024-09-26 11:00:49 -04:00
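
The flag follows the same env-var pattern as the other switches in these scripts; a sketch using tinygrad's getenv helper (both branch bodies are placeholders):

    from tinygrad.helpers import getenv

    if getenv("RUNMLPERF"):
        data = load_real_dataset()   # placeholder: load the real MLPerf training data
    else:
        data = synthetic_batches()   # placeholder: fake data, benchmark-only
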
chenyu
12de203a43 add IGNORE_JIT_FIRST_BEAM to bert scripts (#6769)
* update bert BEAM params

copied from resnet to start with

* just IGNORE_JIT_FIRST_BEAM
2024-09-26 05:38:24 -04:00
wozeparrot
15cd42cfb9 feat: support TRACEMETA=2 in handcode_opt (#6767) 2024-09-26 16:58:29 +08:00
chenyu
5a5fbfa1eb smaller bert script change (#6768)
only the WANDB and RUNMLPERF order. BENCHMARK and BEAM will be handled differently
2024-09-26 04:54:28 -04:00