10633 Commits

Author SHA1 Message Date
Francis Lam
5c5b40880f search: fix edge cases on screening potential ops (#4394)
* search: fix edge cases on screening potential ops

won't change correctness, but will save a little python time by
properly deduplicating potential actions

* check for de-duplication instead of exact valid actions

* refactor long line
2024-05-02 14:53:05 -04:00
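
The de-duplication mentioned in the commit above can be illustrated generically. This is only a sketch, not the code from the PR: the `dedup_actions` helper and the action tuples are hypothetical, and it assumes candidate actions are hashable. It shows why removing repeats before screening saves work without changing which distinct actions survive.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)

def dedup_actions(actions: Iterable[T]) -> List[T]:
    # Hypothetical helper: dict keys are unique and keep insertion order,
    # so the first occurrence of each action is retained in its original position.
    return list(dict.fromkeys(actions))

# Hypothetical candidate list with one repeated action.
candidates = [("UPCAST", 0, 4), ("LOCAL", 1, 8), ("UPCAST", 0, 4), ("UNROLL", 0, 4)]
unique = dedup_actions(candidates)
```

Each distinct action is then evaluated once, so the result is the same as before and only the redundant evaluations are skipped.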
George Hotz
89030b238a add consecutive property to shapetracker 2024-05-02 10:41:28 -07:00
George Hotz
2786dff26d new disk tensor tests (#4393) 2024-05-02 08:54:44 -07:00
chenyu
7492e5d3e7 resnet correct log name for red (#4390) 2024-05-02 10:58:55 -04:00
chenyu
bf31837e6d resnet correct steps_in_val_epoch in logging (#4389)
also added random seed from system in scripts
2024-05-02 10:51:36 -04:00
George Hotz
c8a2047377 testing for all reduce (#4387) 2024-05-02 06:34:10 -07:00
ym555
3113785604 Llama 3 Models (#4339)
* Full Impl

* fix test

* Fix inference loop

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-05-02 06:06:07 -07:00
qazal
0b47818e0f simpler reduceop children chasing (#4350)
* simplest case

* midreduce case

* all tests

* pending things

* unify tests
2024-05-02 15:15:30 +03:00
chenyu
22376e53b7 resnet mlperf logging (#4361)
* resnet mlperf logging

* cropping too much?
2024-05-02 00:00:04 -04:00
George Hotz
f635c4d273 fix define global (#4383)
* fix define global

* remove name from DEFINE_GLOBAL

* fix fuzzing

* fix ptx

* fix python
2024-05-01 22:32:56 -04:00
chenyu
ad116dc5c6 fill in mlperf system description (#4381)
it did not ask for too many details. will put software versions later with tinygrad commit.

```
python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v4.0/tinycorp/systems/tinybox_red.json training 4.0.0
INFO -   System description checker passed for tinybox red
```

```
python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v4.0/tinycorp/systems/tinybox_green.json training 4.0.0
INFO -   System description checker passed for tinybox green
```
2024-05-01 16:47:45 -04:00
chenyu
9358b62073 rename resnet script to dev_beam.sh and dev_run.sh (#4379)
final run_and_time needs to be one script for both. rename the old scripts
2024-05-01 14:41:35 -04:00
chenyu
6628e13a5f pad resnet eval data in model_train (#4374)
asserted if eval sample count is different from total eval file count.
2024-05-01 14:33:42 -04:00
George Hotz
105fbd7925 add 3080 support to NV 2024-05-01 11:17:01 -07:00
chenyu
826cccd54d fix mean underflow for half tensor (#4377)
* fix mean underflow for half tensor

divide only the reduce factor. added unit test and non-nan assertion in resnet training. also added a failing test case for symbolic shape var

* skip for python backend
2024-05-01 13:38:57 -04:00
chenyu
dce7ac0160 NOCLANG=1 for tinybox green ci. (#4378)
CLANG was disabled for tinybox red for speed
2024-05-01 13:31:01 -04:00
George Hotz
272bea5100 GraphRunner (#4375)
* GraphRunner

* new metal graph

* update hsa for graph runner

* put var_vals back

* move that clear after the capture
2024-05-01 10:27:13 -07:00
chenyu
077ea6926c remove downcast_half in sum (#4376)
breaks boolean mean and other stuff
2024-05-01 11:46:44 -04:00
George Hotz
bd49d2854a hotfix: skip fetch tests always 2024-05-01 08:43:26 -07:00
George Hotz
b683d0f496 hotfix: 100% accuracy is wrong 2024-05-01 08:07:18 -07:00
George Hotz
8bcf533a84 gitignore open-images-v6TEST 2024-05-01 13:55:38 +00:00
qazal
ea06f657df fusion tests from test_opt (#4357)
* opt tests

* more sgd

* batchnorm

* models stay in external
2024-05-01 16:44:12 +03:00
George Hotz
995d264666 hotfix: add CNAME to put docs at docs.tinygrad.org 2024-04-30 23:17:35 -07:00
chenyu
683b7c605a pad first batch of imagenet dataloader and update eval (#4368)
* pad first batch of imagenet dataloader and update eval

* pad zero instead of empty for training
2024-05-01 00:21:52 -04:00
wozeparrot
4a26718ca9 feat: tinyboxgreen (#4365) 2024-04-30 19:05:37 -04:00
Francis Lam
16838eae08 mlperf/resnet: update tinybox_red parameters to new best values (#4364)
about 27 minutes to setup and 345ms/110TF steps
2024-04-30 18:08:12 -04:00
George Hotz
27ee49bf30 tensor variable (#4362)
* tensor variable support

* consttype without variable?

* __setitem__

* symbolic mean works

* arange test

* more tests

* a few more tests
2024-04-30 14:08:57 -07:00
nimlgen
d2f89615b2 remove aql remnants in amd (#4346) 2024-04-30 23:36:02 +03:00
Francis Lam
0d33c54d99 kernel: change PADTO check to allow up to 4x padding (#4354)
* kernel: change PADTO check to allow up to 4x padding

also optionally remove PADTO from the search action space with
BEAM_PADTO=0.

* fix test_linearizer test_tensor_cores_padded tests

* update resnet runs to use SPLIT_REDUCEOP=1

* fix up search TC axis and amt checking

* fix up the dimensions of the TC tests
2024-04-30 15:29:34 -04:00
Elias Wahl
babe87a8ae BERT: Checkpoint loading tests (#4359)
* Move checkpoint init to helpers. Add test

* linters

* Move the steps outside of the main train loop

* Move data_get

* data_get belongs to helpers
2024-04-30 14:43:41 -04:00
Francis Lam
c12bcabb07 search: fix actions space checks to ignore TC axis and amt (#4360)
* search: fix actions space checks to ignore TC axis and amt

* add test for number of actions in get_linearizer_actions
2024-04-30 14:02:22 -04:00
chenyu
fdc8fabae5 disable flaky mac gpt2 beam benchmark and add back cifar mac with JIT=2 (#4358)
* debug flaky mac gpt2 beam run

* disable for now
2024-04-30 10:41:37 -04:00
George Hotz
d325be2540 update docs (#4356)
* update docs

* nn.md

* mnist cleanups

* rhip test is very slow
2024-04-30 16:51:42 +09:00
Sohaib
a2d81514fd just get dtype from kwargs (#4355) 2024-04-30 16:26:14 +09:00
Francis Lam
a9a1fa6bbf wmma: add reduce axis choice to TC action space (#4328)
* wmma: add reduce axis choice to TC action space

* add test for TC multi-reduce axis choice
2024-04-29 19:15:39 -04:00
chenyu
93abcd3113 fix function.py sum backward without downcast_half (#4353)
without downcast_half, sum output dtype can be different from input dtype. cast back to input dtype in function.py
2024-04-29 17:53:02 -04:00
Francis Lam
18c61ce077 test/fuzz_linearizer: add --atol/rtol and change half distribution (#4352) 2024-04-29 15:53:59 -04:00
Elias Wahl
71ff68b445 dropout after eval step (#4351) 2024-04-29 15:47:21 -04:00
Elias Wahl
27613dd881 MLPerf BERT: Main training loop (#4288)
* BERT language modeling head + trunc normal initializers

* add train loop + helpers

* shuffle in dataloaders + slight changes in main loop

* beam change

* Minor changes

* random.shuffle

* HParam update

* Use deque for dataloader

* wandb bert project name

* half fixes

* BENCHMARK + remove epoch

* cast + print()

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-04-29 14:35:27 -04:00
Sohaib
61c97d5305 refactor ops_gpu ctypes (#4331)
* refactor ops_gpu ctypes

- remove redundant byref as ctypes automatically handles passing `type` as
  `POINTER(type)`
- use walrus operator instead of init_c_var when possible

* clSetKernelArg argtype is POINTER(None)
2024-04-30 01:33:34 +08:00
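
The two ctypes points in the commit above can be sketched in isolation. This is not the `ops_gpu` code: `write_answer` is a hypothetical function built from a Python callback so that no C library is needed. It relies on documented ctypes behavior: when an argument type is declared as `POINTER(c_int)`, a `c_int` instance can be passed directly and `byref()` is applied automatically.

```python
import ctypes

# Prototype of a C-style function taking an int* and returning nothing.
PROTO = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_int))

@PROTO
def write_answer(ptr):
    # Hypothetical stand-in for a C function that writes through its pointer argument.
    ptr[0] = 42

# Explicit byref still works...
x = ctypes.c_int(0)
write_answer(ctypes.byref(x))

# ...but it is redundant: the declared POINTER(c_int) argtype converts a c_int for us.
# The walrus operator creates the variable and passes it in a single expression.
write_answer(y := ctypes.c_int(0))
```

After both calls, `x.value` and `y.value` hold the value written by the callee.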
qazal
cc1797673e all fusion opportunities (#4348) 2024-04-29 19:32:23 +03:00
chenyu
f363f39e83 fix dtype of const folded sum (#4349)
const folded sum should return the same dtype as regular sum, which can be different from input dtype
2024-04-29 11:40:45 -04:00
geohotstan
bf412aeb80 use tolist instead of numpy for extracting parameters in onnx (#4333)
* still some numpy left

* all pass

* oops indent

* fix up safe_python

* to_python_const
2024-04-29 10:48:20 -04:00
qazal
774a9b0bca override assign_target in fuzz_schedule (#4342)
* store assign_targets

* cleanup

* override target
2024-04-29 11:04:04 +03:00
Francis Lata
bb849a57d1 [MLPerf] UNet3D dataloader (#4343)
* add support for train/val datasets for kits19

* split dataset into train and val sets

* add tests for kits19 dataloader

* add MLPerf dataset tests to CI

* update unet3d model_eval script

* fix linting

* add nibabel

* fix how mock dataset gets created

* update ref implementation with permalink and no edits

* clean up test and update rand_flip implementation

* cleanups
2024-04-28 22:34:18 -04:00
chenyu
82d0ed3cf3 cap default dataset wikipedia max_workers to 32 (#4345)
64 on tinybox OOM
2024-04-28 21:55:21 -04:00
chenyu
c1d8d425eb fix mean of half tensor if sum is greater than half.max (#4327)
sum of half does acc in float32 already, add an arg to not downcast to half and use that in mean
2024-04-28 18:04:54 -04:00
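
The overflow described in the commit above can be reproduced without tinygrad. The sketch below is an emulation under stated assumptions: `to_half` is a hypothetical helper that rounds to IEEE 754 half precision via `struct`, and Python's wider float stands in for the float32 accumulator. It compares downcasting the sum to half before dividing with dividing first and downcasting only the result.

```python
import math
import struct

def to_half(x: float) -> float:
    """Hypothetical helper: round x to IEEE 754 binary16, returned as a Python float."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses values beyond the half range; model that as +/-inf.
        return math.copysign(math.inf, x)

# Four half-representable values whose sum (240000) exceeds the half maximum (65504).
values = [to_half(60000.0)] * 4
acc = sum(values)  # accumulated in a wider type, standing in for float32

# Downcast the sum to half first, then divide: the sum has already become inf.
mean_downcast_first = to_half(to_half(acc) / len(values))

# Divide in the wider type, then downcast only the final mean.
mean_divide_first = to_half(acc / len(values))
```

Here `mean_downcast_first` is infinite while `mean_divide_first` is a finite half value, which is the motivation for not downcasting the intermediate sum.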
qazal
e027879475 hotfix: remove double assignment (#4340) 2024-04-28 13:41:31 -04:00
qazal
23445db2b9 no skipped tests in RHIP (#4337)
* delete skip

* delete split skip

* remu dev

* compiler fails here

* Revert "remu dev"

This reverts commit 28b933d4eb.
2024-04-28 12:23:05 -04:00
Obada Khalili
e4befa41d7 Fix in _reshape_mask (#4332)
* handle reshape with remainder in _reshape_mask

* remove trailing whitespace

* use helper_test_op to generate tensors from shapes

* test in shapetracker too

* remove whitespace

* revert property name in other class tests
2024-04-28 11:57:39 -04:00