* [WIP]: implementation of VITS TTS model
* Implemented VITS model, moved all code to examples/vits.py
* Added support for vctk model, auto download, and cleanups
* Invoke tensor.realize() before measuring inference time
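  The pitfall here is that tinygrad tensors are lazy: building an op graph is nearly free, and the work only happens on realization, so timing without `realize()` measures graph construction rather than inference. A minimal sketch of the same pitfall using a toy lazy-thunk class (hypothetical, not tinygrad's actual internals):

  ```python
  import time

  class LazyTensor:
      # toy stand-in for a lazily-evaluated tensor (illustration only)
      def __init__(self, fn):
          self._fn, self._cache = fn, None
      def realize(self):
          if self._cache is None:
              self._cache = self._fn()  # actual computation happens here
          return self._cache

  def slow_inference():
      time.sleep(0.05)          # pretend this is the model forward pass
      return sum(range(1000))

  t0 = time.perf_counter()
  out = LazyTensor(slow_inference)   # only builds the "graph"; nothing runs
  build_time = time.perf_counter() - t0

  t0 = time.perf_counter()
  out.realize()                      # force the computation before stopping the clock
  run_time = time.perf_counter() - t0

  assert build_time < run_time       # timing without realize() would report ~0
  ```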
* Added support for mmts-tts model, extracted TextMapper class, cleanups
* Removed IPY dep, added argument parser, cleanups
* Tiny fixes to wav writing
* Simplified the code in a few places, set diff log level for some prints
* Some refactoring, added support for uma_trilingual model (anime girls)
* Fixed bug where embeddings are loaded with same backing tensor, oops
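  The aliasing bug is easy to reproduce outside tinygrad: if two embedding tables are loaded as views of the same buffer instead of independent copies, writing one silently clobbers the other. A small numpy sketch of the failure mode and the fix (illustrative names, not the code from this PR):

  ```python
  import numpy as np

  buf = np.zeros(4, dtype=np.float32)

  # buggy load: both "embeddings" alias the same backing buffer
  emb_a = buf
  emb_b = buf
  emb_b[:] = 1.0
  print(emb_a[0])        # 1.0 — emb_a was clobbered too

  # fix: give each embedding its own copy of the data
  emb_a = np.zeros(4, dtype=np.float32)
  emb_b = emb_a.copy()
  emb_b[:] = 1.0
  print(emb_a[0])        # 0.0 — independent now
  ```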
* Added emotional embed support, added cjks + voistock models
  - voistock is a multilingual model with over 2k anime characters
  - cjks is a multilingual model with 24 speakers
  both are kinda bad for English though :c

* Removed `Tensor.Training=False` (not needed and wrong, oops)
* Changed default model and speaker to vctk with speaker 6
* Ported the rational_quadratic_spline function to fully use tinygrad ops, no numpy
* Removed accidentally pushed test/spline.py
* Some slight refactors
* Replaced masked_fill with tensor.where
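  `masked_fill(mask, value)` can be expressed as a select: where the mask is true, take the fill value, otherwise keep the original element. A numpy sketch of the equivalence (in tinygrad the pattern is roughly `mask.where(value, x)`):

  ```python
  import numpy as np

  x = np.array([1.0, 2.0, 3.0, 4.0])
  mask = np.array([True, False, True, False])

  # masked_fill(mask, -1e9) rewritten as a where/select:
  y = np.where(mask, -1e9, x)
  print(y)   # masked positions replaced, the rest untouched
  ```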
* Added y_length estimating, plus installation instructions, plus some cleanups
* Fixed overestimation log message
* Changed default value of `--estimate_max_y_length` to False
This is only useful for larger inputs.
* Removed printing of the phonemes
* Changed default value of `--text_to_synthesize`