Commit Graph

1207 Commits

Tobias Fischer
8c9c1cf62f Pulled CLIP and UNet into Separate Files (#5253)
* pulled clip and unet into separate files

* reference cleanup, lru cache fix

* better pool indexing
2024-07-01 22:33:01 -04:00
chenyu
b9122ecdaf revert stable diffusion validation with threefry (#5248)
* Revert "use threefry in stable diffusion benchmark (#4988)"

This reverts commit 44dfa37c70.

* sdxl and validation fix

* relax threshold
2024-07-01 14:43:47 -04:00
George Hotz
3df47bc21e OpenELM + repeat_interleave (#5234)
* start writing openelm

* progress...hit bug

* repeat_interleave support

* gqa

* add rotary embedding

* spp

* i think it runs correctly

* broken

* output is good now

* cleanups

* no io_uring on android
2024-06-30 15:18:39 -07:00
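
The repeat_interleave added in this commit is the piece that enables grouped-query attention (GQA): each key/value head is repeated so K and V line up with the larger number of query heads. A minimal numpy sketch of the semantics, assuming the new Tensor method behaves like np.repeat along one axis:

```python
import numpy as np

# expand 2 KV heads to match 8 query heads, as GQA requires
n_q_heads, n_kv_heads, seqlen, head_dim = 8, 2, 4, 16
k = np.random.randn(n_kv_heads, seqlen, head_dim)
k_expanded = np.repeat(k, n_q_heads // n_kv_heads, axis=0)  # repeat_interleave along axis 0
assert k_expanded.shape == (n_q_heads, seqlen, head_dim)
```
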
chenyu
88763eb9ff fix stable_diffusion with fp16 (#5239) 2024-06-30 12:59:31 -04:00
chenyu
7090eac8cb validate sdxl output and put it in benchmark (#5211)
* validate sdxl output and put it in benchmark

* don't print fetch progress_bar in CI
2024-06-28 11:40:52 -04:00
chenyu
63fa4e2a0e fix seed = 0 in sdxl (#5209)
also removed a few unneeded realize and contiguous calls
2024-06-28 08:48:59 -04:00
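
The commit title points at the classic falsy-zero pitfall: a seed of 0 being treated as "no seed given". A hypothetical sketch of that pattern, not the actual sdxl code:

```python
import random

def resolve_seed(seed=None):
  # buggy version: `if not seed:` also fires for the legitimate seed 0
  # fixed version: only fall back to a random seed when none was given
  if seed is None: seed = random.randint(0, 2**32 - 1)
  return seed

assert resolve_seed(0) == 0  # seed 0 is now respected
```
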
Tobias Fischer
4688f97d48 Add SDXL Inference to Examples (#5206)
* added sdxl inference code

* fixed trailing whitespace

* use original impl code, removed unneeded numpy calls
2024-06-28 07:42:28 -04:00
chenyu
0ba093dea0 hotfix: only validate stable diffusion when using threefry (#5166) 2024-06-26 16:50:38 -04:00
chenyu
e4a5870b36 validate stable_diffusion output (#5163)
the default steps were changed but the validation wasn't updated to match
2024-06-26 16:42:21 -04:00
nimlgen
21b225ac45 llama3 download works (#5160) 2024-06-26 22:45:13 +03:00
wozeparrot
c91b3c4079 shard llama3 on 0 sometimes (#5157) 2024-06-26 11:50:57 -07:00
Elias Wahl
e267f3161d Add MLLogger (#5125)
* add MLPerf logger

* eval steps

* start with step 1

* compliance for 3.1.0 and 4.0.0

* more compliance

* assert, comment and contiguous
2024-06-26 12:23:56 -04:00
David Hou
3604642847 Llama shard axis 0 sometimes (#5123)
* make buffer view optional with a flag [run_process_replay]

* do not view when sharding to save memory [run_process_replay]

* llama shard axis=0 sometimes

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-26 10:35:25 -04:00
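
Sharding on axis 0 splits a weight's first dimension across devices instead of replicating the whole tensor, which is what saves the memory mentioned above. A hedged sketch of the call with placeholder device names; Tensor.shard's exact signature may differ across versions:

```python
from tinygrad import Tensor

devices = ("GPU:0", "GPU:1")                 # hypothetical two-device setup
w = Tensor.rand(4096, 4096)
w_replicated = w.shard(devices, axis=None)   # full copy on every device
w_sharded = w.shard(devices, axis=0)         # each device holds a 2048-row slice
```
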
chenyu
dade7677cf validate llama3 output only with model "LLaMA-3/8B-SF-DPO" (#5138) 2024-06-24 20:58:25 -04:00
chenyu
055e616302 cleanup mnist data load in beautiful_mnist (#5106) 2024-06-22 18:31:51 -04:00
chenyu
e356807696 tinytqdm.set_description and tinytrange (#5101) 2024-06-22 14:45:06 -04:00
chenyu
8080298739 s/tinytqdm/tqdm (#5103)
except in the unit tests, where the real tqdm is imported
2024-06-22 14:18:26 -04:00
chenyu
e468601226 update llama attention casting (#5096)
* update llama attention casting

updated the middle cast in scaled_dot_product_attention and removed the hard-coded half in llama attention.

* fix that
2024-06-22 10:57:17 -04:00
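
The "middle cast" keeps the two matmuls in the input dtype (typically half) but runs the numerically sensitive softmax in float32. A hedged sketch of the idea; the real scaled_dot_product_attention in tinygrad may differ in details:

```python
import math
from tinygrad import Tensor, dtypes

def attention(q: Tensor, k: Tensor, v: Tensor) -> Tensor:
  scores = q.matmul(k.transpose(-2, -1)) / math.sqrt(q.shape[-1])
  # the middle cast: softmax in float32, then back to the query dtype
  return scores.cast(dtypes.float32).softmax(-1).cast(q.dtype).matmul(v)
```
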
wozeparrot
acb715c64c fix: llama3 special tokens (#5045) 2024-06-18 17:08:44 -07:00
chenyu
a3ed4176c8 use tinytqdm in active tests and examples (#5038)
* use tinytqdm in active tests and examples

stress test this before 0.9.1

* no set_description
2024-06-18 16:01:19 -04:00
Elias Wahl
f31ef11537 Better default hparams for large BS (#5030)
* better default hparams for large BS

* bf16 too

* use tuple
2024-06-18 11:13:06 -04:00
Elias Wahl
7bfa9101c0 Float in scaled dot product attention (#4985)
* Monkeypatch scaled-dot-product-attention

* Use dot instead of matmul

* new api

* imports

* least_upper_dtype
2024-06-18 08:16:41 -04:00
chenyu
c52352bd9a fix yolov8 example (#5003)
it was creating a Tensor from a list of numpy arrays, which is not supported now that creating a Tensor from a list no longer goes through numpy.
2024-06-16 20:47:29 -04:00
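
A hedged sketch of the failure mode and the fix: stack the numpy arrays into a single ndarray before constructing the Tensor (shapes here are made up):

```python
import numpy as np
from tinygrad import Tensor

chunks = [np.random.rand(3, 32, 32).astype(np.float32) for _ in range(4)]
# t = Tensor(chunks)           # unsupported: a list of np.ndarray
t = Tensor(np.stack(chunks))   # supported: one contiguous np.ndarray
```
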
chenyu
44dfa37c70 use threefry in stable diffusion benchmark (#4988)
also updated the default steps to 10, making it easier to tell the image is following the prompt.
2024-06-15 20:25:29 -04:00
wozeparrot
ce1ed374c9 more tinychat fixes (#4971) 2024-06-15 16:29:39 -07:00
wozeparrot
8209cd3c55 easier llama3 + fetch subdir (#4938) 2024-06-14 13:47:27 -07:00
chenyu
67e8df4969 remove numpy from dtype (#4969)
replaced all dtype.np with _to_np_dtype, defined in tensor.py.

after this, the only numpy usages are (1) Tensor(np.ndarray), (2) constructing .numpy() output, (3) the numpy random buffer
2024-06-14 15:38:45 -04:00
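
The idea behind _to_np_dtype is to keep the numpy mapping in one helper in tensor.py so the DType objects themselves stay numpy-free. An illustrative stand-in, not the actual implementation:

```python
import numpy as np
from tinygrad import dtypes

_NP_MAP = {dtypes.float32: np.float32, dtypes.float16: np.float16,
           dtypes.int32: np.int32, dtypes.uint8: np.uint8}

def to_np_dtype(dtype):  # hypothetical simplification of tensor.py's _to_np_dtype
  return _NP_MAP.get(dtype)

assert to_np_dtype(dtypes.float32) is np.float32
```
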
wozeparrot
2a974ff257 fix: no readablestream await of, too new (#4965) 2024-06-14 11:22:19 -07:00
Elias Wahl
d2e3c391e8 Residual in MLM loss + Change default steps (#4935)
* Residual in mlm loss

* Reduce default steps to 160K * 24

* oops

* comment
2024-06-12 16:09:18 -04:00
wozeparrot
3d13c23bfa llama3 --download_model (#4922) 2024-06-11 22:59:59 -07:00
wozeparrot
2849d0a2a1 fix copying to clipboard on a non secure context (#4890) 2024-06-08 16:51:47 -07:00
wozeparrot
6c24eda522 feat: tinychat (#4869) 2024-06-08 12:05:45 -07:00
Brennan Kinney
9445946cae docs: Update referenced yaml in yolov8.py (#4871)
YAML files have since been relocated.
2024-06-08 15:05:00 -04:00
Nik
085c0bbf6b add mlperf train subset of openimages (#4841) 2024-06-05 10:10:11 -04:00
Elias Wahl
e576aca044 Disable dropout (#4837) 2024-06-04 18:57:26 -04:00
Elias Wahl
bb248a0dd1 Optional half matmul (#4835)
* half linear

* move weight cast back

* oops

* matmul dtype var

* todo comment
2024-06-04 17:53:41 -04:00
Elias Wahl
04e237328b Refactor to class style (#4804) 2024-06-04 14:08:31 -07:00
George Hotz
eecfdd2f6e hotfix: fix dataset reading for new llm.c 2024-06-03 14:10:05 +02:00
Francis Lata
707099487a Multiprocessing UNet3D dataloader (#4801)
* testing dataloader

* matching dataloader implementation for unet3d

* remove comments

* clean up dataloader

* add cookie and cleanup

* use shm_path when creating SharedMemory

* add support for testing resnet and unet3d dataloaders

* update dataset test to return preprocessed data directory in prep for dataloader testing

* pass preprocessed dataset directory properly

* update loader function for dataloader

* add shuffling on indices

* update shm name

* more cleanup for unet3d dataloader

* remove changes to tests

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-02 11:30:47 -04:00
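
The pattern behind this dataloader: worker processes write preprocessed samples into a SharedMemory block that the parent maps as a numpy array, avoiding per-sample copies through pipes. A self-contained sketch with hypothetical shapes, not the UNet3D code itself:

```python
import numpy as np
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory

SHAPE, DTYPE = (1, 128, 128, 128), np.float32
NBYTES = int(np.prod(SHAPE)) * np.dtype(DTYPE).itemsize

def worker(shm_name: str, value: float):
  shm = SharedMemory(name=shm_name)
  buf = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
  buf[:] = value  # stand-in for real preprocessing work
  shm.close()

if __name__ == "__main__":
  shm = SharedMemory(create=True, size=NBYTES)
  p = Process(target=worker, args=(shm.name, 7.0))
  p.start(); p.join()
  sample = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
  print(sample.flat[0])  # 7.0, written by the worker with no copy back
  shm.close(); shm.unlink()
```
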
wozeparrot
ed0a740fe4 greater chat api endpoint compat (#4792) 2024-05-30 22:47:31 -07:00
chenyu
f2414c666f fix train_gpt2.py (#4771)
added `with Tensor.train():`
2024-05-29 12:01:34 -04:00
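
Without the Tensor.train() context, training-mode behavior isn't enabled during the optimizer step. A minimal sketch of the fixed shape of the loop, with a toy model standing in for GPT-2:

```python
from tinygrad import Tensor
from tinygrad.nn import Linear
from tinygrad.nn.optim import SGD
from tinygrad.nn.state import get_parameters

model = Linear(10, 2)                    # toy stand-in for the GPT-2 model
opt = SGD(get_parameters(model), lr=0.1)
X, Y = Tensor.rand(8, 10), Tensor([0, 1, 0, 1, 0, 1, 0, 1])

with Tensor.train():                     # the fix: run the step in train mode
  loss = model(X).sparse_categorical_crossentropy(Y)
  opt.zero_grad()
  loss.backward()
  opt.step()
```
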
chenyu
7624ad3ddd add --timing and --profile to llama3 example (#4767) 2024-05-28 16:24:44 -04:00
chenyu
e614b7c696 docs: showcase remove mnist_gan and add conversation.py (#4757)
fixed both examples, and conversation.py is the better showcase
2024-05-28 11:09:26 -04:00
chenyu
fd249422f5 minor cleanup example stable_diffusion (#4753) 2024-05-28 00:05:37 -04:00
Elias Wahl
c4b0acf095 Global norm + small changes (#4749)
* norm

* no empty

* default loss scaler in float
2024-05-27 18:35:27 -04:00
chenyu
31358cbea5 change Tensor.stack to method (#4719) 2024-05-24 17:04:19 -04:00
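
This changes stack from a function over a list to a method on the first tensor. A hedged before/after sketch (the old signature is assumed):

```python
from tinygrad import Tensor

a, b = Tensor.ones(2, 3), Tensor.zeros(2, 3)
# old style (assumed): Tensor.stack([a, b], dim=0)
c = a.stack(b, dim=0)   # new style: method with varargs
assert c.shape == (2, 2, 3)
```
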
chenyu
38bc38cdff fix llama example quantize (#4699)
* fix llama example quantize

import quantize layers from new example llama3

add to mac benchmark

* fix that

* save the files
2024-05-23 15:35:26 -04:00
chenyu
792a494eb8 fix various examples (#4691)
* fix examples that used ax1 and ax2 for transpose

* fix that

* update those
2024-05-22 20:43:21 -04:00
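
A hedged sketch of the transpose fix: the old keyword names (assumed ax1=/ax2=) are gone, so the examples now pass the two dims positionally:

```python
from tinygrad import Tensor

x = Tensor.rand(2, 3, 4)
# y = x.transpose(ax1=1, ax2=2)   # old keyword form, no longer accepted
y = x.transpose(1, 2)             # swap dims 1 and 2
assert y.shape == (2, 4, 3)
```
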
Elias Wahl
acc0039cfc Resume fix + scheduler for non weight decay params (#4679)
* move ckpt dir

* fix resume. Add scheduler group
2024-05-21 19:38:13 -04:00
chenyu
5e3fbbb33e llama3 example add manual seed and log seed (#4667) 2024-05-20 19:09:57 -04:00