tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Files

chenyu ff05bff221 put bert data shard inside jit (#9160)

python time 45ms -> 9ms, it was spending time to schedule the shard

also init bert data on CLANG since it's from numpy, so we don't create the tensor on default device then shard into GPUS

2025-02-18 10:36:54 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| scripts | UNet3D MLPerf (#3470) | 2024-09-10 04:37:28 -04:00 |
| training_submission_v4.0/tinycorp | copy mlperf 4.0 to mlperf 4.1 (#5614) | 2024-07-20 16:12:00 -04:00 |
| training_submission_v4.1/tinycorp | update mlperf systems and copy 4.1 to 5.0 (#7004) | 2024-10-11 16:20:34 -04:00 |
| training_submission_v5.0/tinycorp | free_intermediates in bert (#9040) | 2025-02-12 10:00:39 -05:00 |
| dataloader.py | put bert data shard inside jit (#9160) | 2025-02-18 10:36:54 -05:00 |
| helpers.py | put bert data shard inside jit (#9160) | 2025-02-18 10:36:54 -05:00 |
| initializers.py | Tuple -> tuple, List -> list [pr] (#8936) | 2025-02-06 14:21:19 -05:00 |
| losses.py | [MLPerf][UNet3D] Add DICE loss + metrics (#4204) | 2024-04-17 20:09:33 -04:00 |
| lr_schedulers.py | fp16 resnet (without expand backwards sum in float, doesn't work) (#3816) | 2024-03-28 01:25:37 -04:00 |
| metrics.py | [MLPerf][UNet3D] Add DICE loss + metrics (#4204) | 2024-04-17 20:09:33 -04:00 |
| model_eval.py | [MLPerf] Prepare openimages dataset script (#6747) | 2024-09-27 11:13:56 -04:00 |
| model_spec.py | move globalcounters to ops (#2960) | 2024-01-01 14:21:02 -08:00 |
| model_train.py | put bert data shard inside jit (#9160) | 2025-02-18 10:36:54 -05:00 |
| README | start on mlperf models | 2023-05-10 16:30:49 -07:00 |

README

Each model should be a clean single file.
They are imported from the top-level `models` directory.

Each model should be capable of loading weights from the reference implementation.

We will focus on these 5 models:

# Resnet50-v1.5 (classic) -- 8.2 GOPS/input
# Retinanet
# 3D UNET (upconvs)
# RNNT
# BERT-large (transformer)
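The 8.2 GOPS/input figure quoted for Resnet50-v1.5 can be turned into a rough compute-bound throughput ceiling. The sketch below is only an illustration: it assumes "GOPS/input" means 1e9 operations per input, and the function name, the 100 tera-ops/s peak, and the 40% utilization are hypothetical placeholders, not numbers from this repository.

```python
# Rough upper bound on throughput from a per-input operation count.
# Assumptions (not from the README): "GOPS/input" is read as 1e9 operations
# per input, and the device numbers used below are hypothetical placeholders.

RESNET50_GOPS_PER_INPUT = 8.2  # figure quoted in the model list


def max_inputs_per_second(peak_tops: float, gops_per_input: float, utilization: float = 1.0) -> float:
    """Return the compute-bound ceiling on inputs processed per second."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    # Sustained operations per second the device is assumed to deliver.
    ops_per_second = peak_tops * 1e12 * utilization
    # Divide by the operations needed for a single input.
    return ops_per_second / (gops_per_input * 1e9)


if __name__ == "__main__":
    # Hypothetical accelerator: 100 tera-ops/s peak, 40% sustained utilization.
    rate = max_inputs_per_second(100.0, RESNET50_GOPS_PER_INPUT, utilization=0.4)
    print(f"{rate:.0f} inputs/s (compute-bound ceiling)")
```

Real throughput will be lower than this ceiling, since memory bandwidth, data loading, and kernel launch overhead are all ignored here.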

These models are used in both the training and inference benchmarks:
https://mlcommons.org/en/training-normal-21/
https://mlcommons.org/en/inference-edge-30/
And we will submit to both.

NOTE: we are in the Edge category, since we don't have ECC RAM.