* add DICE loss and metrics
* update dice to include reference implementation's link
* remove unused imports
* remove unnecessary test file and update pred + label for metrics and losses test
* add tests to CI + add exclusion of mlperf_unet3d
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* Add ResNet inference test and cannon
* Test with ResNet50
* test_car works with resnet fix
* Add KiTS19 dataset
* KiTS19: Implement iterate
* No batch load for this dataset
* Save results on iterate
* Implement dice score
* Add data prep and eval functions
* Resolve shape issue
* Conversion works but wrong values
* Segfaults when load_from_pretrained is called
* Fix segfault and assign properly
* Final result generated, though very slow
* Store and load final result to save time
* Fix typo in finalize
* Score computes
* More bug fixes, dice score is very low
* Working broken code
* Assign output values to result
* Getting a much higher score now
* Fix dataset preprocessing
* Mean DICE score of 88.5
* Ugh, typo
* Attempt to reimplement model
* Rename layers
* Tiny model works, kinda
* Accuracy? gone
* Implement InstanceNorm and match torch
* Test instance norm 2d and 3d
* Combined input block with downsample block
* Tiny model works, support strided convtranspose
* Commands to download dataset
* Clean up a bit
* unet3d_v2 -> unet3d
* Remove duplicated code
* Oops, put tests back
* feat: add mlperf bert model
* feat: switch to nn.Embedding
* clean+fix: fix formatting
* feat: add simple downloader
* feat: metrics
* feat: don't actually need exact match
* feat: doing a run
* feat: set eps on the layernorms
* clean+fix: cleaner impl + hopefully fixed
* feat: move dataset initialization into iterate
* feat: move tokenizer out of iterate
* clean+fix: cleaner + working
* clean: cleanup
* fix: fix metrics
* feat: need to use original bert gelu + download vocab
* feat: make directory if it doesn't exist yet
* feat: jit go brrr
* feat: initial rnn-t
* feat: working with BS>1
* feat: add lstm test
* feat: test passing hidden
* clean: cleanup
* feat: specify start
* feat: way faster lstm & model
* fix: default batch size
* feat: optimization
* fix: fix metrics
* fix: fix feature splicing
* feat: cleaner stacktime
* clean: remove unused import
* clean: remove extra prints
* fix: fix tests and happy llvm
* feat: have the librispeech dataset in its own dir
* clean: unused variable
* feat: no longer need numpy for the embedding + slightly more memory efficient lstm
* fix: forgot to remove something that broke tests
* feat: use relative paths
* feat: even faster
* feat: remove pointless transposes in StackTime
* fix: correct forward
* feat: switch to soundfile for loading and fix some leaks
* feat: add comment about initial dataset setup
* feat: jit more things
* feat: default batch size back to 1
  batch sizes larger than 1 are broken again :(
  and even in the reference implementation they give worse results