word-level timestamps in transcribe() (#869)

* word-level timestamps in `transcribe()`

* moving to `timing.py`

* numba implementation for dtw, replacing dtw-python

* triton implementation for dtw

* add test for dtw implementations

* triton implementation of median_filter

* a simple word-level timestamps test

* add scipy as dev dependency

* installs an older version of Triton if CUDA < 11.4

* fix broken merge

* loosen nvcc version match regex

* find_alignment() function

* miscellaneous improvements

* skip median filtering when the input is too small

* Expose punctuation options in cli and transcribe() (#973)

* fix merge error

* fix merge error 2

* annotating that word_timestamps is experimental
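
For readers unfamiliar with the dynamic time warping (DTW) step named in the bullets above, here is a minimal sketch in plain NumPy. It is not the numba or Triton implementation this commit adds. The function name `dtw_path`, the convention that rows are tokens and columns are audio frames, and the `(2, path_length)` output layout are all assumptions made for illustration.

```python
import numpy as np


def dtw_path(cost: np.ndarray) -> np.ndarray:
    """Illustrative DTW: cheapest monotonic path through a cost matrix.

    Returns an integer array of shape (2, path_length): row 0 holds the
    row indices of the path, row 1 the column indices. Hypothetical helper,
    not the function defined in this commit.
    """
    n, m = cost.shape
    # acc[i, j] is the cheapest cost of any path ending at cost[i-1, j-1];
    # the extra border of +inf forces every path to start at cost[0, 0].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    # step[i, j] records the move that reached the cell:
    # 0 = diagonal, 1 = from the row above, 2 = from the column to the left.
    step = np.zeros((n + 1, m + 1), dtype=np.int8)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = (acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            k = int(np.argmin(candidates))
            acc[i, j] = cost[i - 1, j - 1] + candidates[k]
            step[i, j] = k

    # Walk back from the bottom-right corner to recover the path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        k = step[i, j]
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    return np.array(path[::-1]).T


# Toy example: 2 tokens against 4 frames, where low cost means a good match.
cost = np.array([[0.0, 0.0, 1.0, 1.0],
                 [1.0, 1.0, 0.0, 0.0]])
print(dtw_path(cost).tolist())  # [[0, 0, 1, 1], [0, 1, 2, 3]]
```

In the toy example the first token is matched to frames 0 and 1 and the second to frames 2 and 3. The double Python loop is quadratic and slow on real inputs, which is presumably why the commit moves this step to numba and Triton.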

---------

Co-authored-by: ryanheise <ryan@ryanheise.com>
Jong Wook Kim
2023-03-06 17:00:49 -05:00
committed by GitHub
parent eab8d920ed
commit 500d0fe966
14 changed files with 768 additions and 77 deletions

@@ -1,3 +1,4 @@
+numba
 numpy
 torch
 tqdm