* lr schedulers + test
* lr scheduler test moved + integration test
* integration test for all lr scheduler
* lr scheduler test now deterministic
* changed optimizer + parameters for lr sched test
* optimizations in symbolic.py
* fix infinite recursion when expanding sums
* add test case to make sure NumNodes are hoisted up in cases where MulNodes cancel each other out
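As an illustration of the cancellation the two commits above refer to (a minimal sketch, not tinygrad's actual `symbolic.py`): when like terms in an expanded sum cancel, only the constant term should survive, hoisted to the top of the expression.

```python
# Hypothetical term collector: terms are (coeff, var) pairs, var=None
# marks a constant. Variables whose coefficients cancel are dropped,
# leaving the constant ("NumNode") behind.
from collections import defaultdict

def simplify_sum(terms):
    coeffs = defaultdict(int)
    for coeff, var in terms:
        coeffs[var] += coeff
    # drop variables whose coefficients cancelled out
    return [(c, v) for v, c in coeffs.items() if c != 0 or v is None]

# 2*x + 3 + (-2)*x collapses to just the constant 3
assert simplify_sum([(2, "x"), (3, None), (-2, "x")]) == [(3, None)]
```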
* Don't collapse dimensions during batched matmul (FIX#799)
* Avoid reshaping tensor to the same shape
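The no-op-reshape guard mentioned above amounts to an early return when the target shape already matches; a sketch with NumPy standing in for the tensor type:

```python
import numpy as np

def reshape(t, new_shape):
    # Hypothetical helper: return the tensor unchanged when the target
    # shape already matches, avoiding a pointless reshape in the graph.
    if tuple(t.shape) == tuple(new_shape):
        return t
    return t.reshape(new_shape)

a = np.zeros((2, 3))
assert reshape(a, (2, 3)) is a            # same shape: no new object
assert reshape(a, (3, 2)).shape == (3, 2)
```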
* Skip batched matrix multiply when IMAGE is set
* feat: promote Embedding to nn
* fix: fix failing test
* feat: add test with jit
* feat: rewrite embedding to no longer need stacked for loops
* clean+fix: don't know how that happened
* feat: initial rnn-t
* feat: working with BS>1
* feat: add lstm test
* feat: test passing hidden
* clean: cleanup
* feat: specify start
* feat: way faster lstm & model
* fix: default batch size
* feat: optimization
* fix: fix metrics
* fix: fix feature splicing
* feat: cleaner stacktime
* clean: remove unused import
* clean: remove extra prints
* fix: fix tests and happy llvm
* feat: have the librispeech dataset in its own dir
* clean: unused variable
* feat: no longer need numpy for the embedding + slightly more memory efficient lstm
* fix: forgot to remove something that broke tests
* feat: use relative paths
* feat: even faster
* feat: remove pointless transposes in StackTime
* fix: correct forward
* feat: switch to soundfile for loading and fix some leaks
* feat: add comment about initial dataset setup
* feat: jit more things
* feat: default batch size back to 1
batch sizes larger than 1 are broken again :(
and even the reference implementation gives worse results with them
* e2e testing
* min failure
* no affine on bn, still fails
* why did i think i could detach that?
* allow more kernels for bn
* some test issue i don't understand
* no zeroview start
* closer
* stride mask
* st tests pass, delete ZeroView
* byebye zv
* close to working
* not contiguous with mask
* subtract, don't add
* mask on view
* ugh, that shouldn't have been in there
* shape merge
* bugfixes
* fuzzer + 4 fuzzer failures
* fuzzer for symbolic
* more fuzzing and nothing
* that fuzzer doesn't hit either
* fixes padding...ugh
* no more offsets
* working
* rewrite load and store
* all checks
* fix idxs
* progress
* bugfix
* float4_axis
* works
* cleanups
* complex valids_okay
* make maximum split grad
* added test for maximum split grad when equal
* minor expr simplification
* (2-eq)/2 only once
* update test bc one more sum output child stays
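The split-gradient behavior tested above — the `(2-eq)/2` factor — means the gradient of an elementwise maximum is divided evenly between the inputs wherever they are equal. A NumPy sketch (illustrative, not tinygrad's implementation):

```python
import numpy as np

def maximum_grad(x, y, grad_out):
    # Where x > y all gradient flows to x; where x == y it splits 50/50.
    mask = (x > y).astype(float) + 0.5 * (x == y).astype(float)
    return grad_out * mask, grad_out * (1.0 - mask)

gx, gy = maximum_grad(np.array([1.0, 2.0, 3.0]),
                      np.array([2.0, 2.0, 2.0]),
                      np.ones(3))
# x < y, x == y, x > y  ->  gx = [0, 0.5, 1], gy = [1, 0.5, 0]
assert gx.tolist() == [0.0, 0.5, 1.0]
assert gy.tolist() == [1.0, 0.5, 0.0]
```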
* activation ops
* type hints + more testing
* formatting correction + parameter testing
* fixes to shape testing
* hardtanh to use clip + removed type hints
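The hardtanh rewrite above works because hardtanh is exactly a clamp to `[min_val, max_val]`; a minimal sketch using NumPy's `clip`:

```python
import numpy as np

def hardtanh(x, min_val=-1.0, max_val=1.0):
    # hardtanh is just a clip/clamp to [min_val, max_val]
    return np.clip(x, min_val, max_val)

assert hardtanh(np.array([-2.0, 0.5, 3.0])).tolist() == [-1.0, 0.5, 1.0]
```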
* assign val fix
* fix binop, other tests failure
* that was a bad idea
* better layernorm
* inference kernel count tests
* new style reshape pushing
* fixup replacement
* 199 kernels is okay. fix flops
* push reshape through unaryops only
* GRAPH=2 draws the phantom ops
* found resnet issue
* non working test
* mul is cheaper than div
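The div-to-mul rewrite above is the standard strength reduction: division by a known constant becomes one reciprocal (foldable at compile time) plus a multiply, which is cheaper than a divide on most hardware. A hypothetical sketch:

```python
def div_by_const(x, c):
    # Rewrite x / c as x * (1/c); the reciprocal is computed once
    # when c is a compile-time constant.
    recip = 1.0 / c
    return x * recip

assert abs(div_by_const(10.0, 4.0) - 2.5) < 1e-12
```

Note this changes rounding for floats (the results can differ in the last ulp), which is why compilers typically only do it under fast-math-style flags.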
* OPT inflation
* SHUFFLE_PAD_OPS in OPT=2
* linearizer outputs something
* working ish
* cstyle codegen
* clang mostly works
* fix load valid
* fix numberless loop
* fancy gen
* working
* fix enet compiler
* cleanups
* float4 upcasting
* less lines
* supports_float4
* constant folding
* mulacc
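Constant folding, as named above, replaces an op whose inputs are all constants with its result before any code is emitted; a toy folder over a tuple-based expression tree (illustrative only):

```python
def fold(node):
    # Nodes are ("add"|"mul", lhs, rhs) tuples; leaves are numbers or names.
    if not isinstance(node, tuple):
        return node
    op, a, b = node
    a, b = fold(a), fold(b)
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return {"add": a + b, "mul": a * b}[op]   # fold at compile time
    return (op, a, b)

# ("add", 2, 3) folds to 5; the symbolic "x" blocks further folding
assert fold(("mul", ("add", 2, 3), "x")) == ("mul", 5, "x")
```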
* internet tests flaky in CI
* 90% image support
* fix image generic
* bugs exposed with shapetracker and single view
* new llvm
* use vload, remove OLD
* that's really poorly done
* ending up being more lines
* runs one metal kernel
* conv2d works
* ops tests are passing
* const folding
* all ops work
* pre commit always passes
* torch works
* working still
* fix graph test
* tests passing
* image almost works
* image conv works
* most images
* fix custom
* fix assignment
* fix compile enet
* clean up comments
* fix realize return value
* include shapetracker in LB repr
* copy should make a copy
* reenable method cache
* fix lna
* dtypes in graph
* forward only for IMAGE=2
* simple realize
* getting close
* fixup new api, it's good except the kernel count
* back to 197 kernels
* tests should pass
* go to a real float
* no type_on_cpu
* fix the docs
* put shapetracker back in its proper place