04-16-2025, 04:53 PM
|
#1 |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Hi REAPER users!
I'd like to share with you a new open source project called ReaSpeech Lite. It's a rewrite of ReaSpeech (discussed previously at https://forum.cockos.com/showthread.php?t=294811 ) in C++ as a VST3/ARA plugin. Unlike the original ReaSpeech, it does not depend on Docker or any other software, other than CUDA Toolkit on Windows to get GPU acceleration.
I'd love to hear from you if you try it out. Please see the announcement blog post here: https://techaud.io/blog/20250416-reaspeech-lite/
ReaSpeech Lite uses the ARA 2.0 API to read the source audio from your project and run it through an embedded whisper.cpp speech recognition model. It is built using the JUCE 8 framework, and it uses a WebView component for its user interface, which is written in TypeScript.
Since ReaSpeech Lite is a VST, it can in theory run on other DAWs, though it has been mainly tested on REAPER and uses the REAPER C++ API for the "Create Markers" and similar features.
Because ReaSpeech Lite uses a web interface, it has much better multi-language support compared to the original ReaSpeech, and has no trouble displaying non-Latin characters.
Installation and usage instructions are available on GitHub: https://github.com/TeamAudio/reaspeech-lite
Thanks, and let me know if you have any questions!
|
|
04-16-2025, 06:47 PM
|
#2 |
|
Human being with feelings
Join Date: Jul 2021
Location: Swiss Zürich
Posts: 1,381
|
Is German language available ?
|
|
|
04-16-2025, 06:50 PM
|
#3 |
|
Human being with feelings
Join Date: Jun 2006
Posts: 22,772
|
installed the CUDA toolkit, latest drivers, no cuda version showing up.
But the regular one does! very cool, thank you for your work!! |
|
|
04-16-2025, 06:51 PM
|
#4 |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Yes, Whisper supports German for transcription and translation.
|
|
|
04-16-2025, 06:53 PM
|
#5 | |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Quote:
Not that this is any consolation for Windows users, but if you're on Mac (Apple Silicon), it should "just work". |
|
|
|
04-17-2025, 03:00 AM
|
#6 |
|
Human being with feelings
Join Date: May 2017
Location: Somewhere over the Rainbow
Posts: 6,966
|
Is it possible to access the text with time-position somehow via Reaper's API? Like Extstates or something, including a way to "reset" them, when I want to transcribe new text?
I would like to use it to transcribe text but without it creating markers. For instance, it would be possible to use it as a way to control Reaper via voice recognition, so people with limited motor abilities could gain accessibility by controlling Reaper via voice commands. But for this, I would need to be able to get the deciphered text without it being put somewhere into the project, and a way to periodically reset the extstates/etc. that store the found text. By resetting, I mean: I transcribe audio whose text is stored in extstates (including time-position), I read the extstates, I delete these extstates, so the next time I transcribe text, I don't reread the "old" transcribed text, only the new transcribed text.
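The read, then delete cycle described here could be sketched roughly as follows. This is only an illustration under assumptions: the section name "ReaSpeech" and the keys "segment_0", "segment_1", ... are hypothetical (the plugin does not publish anything like this at the moment), and a plain dict stands in for REAPER's extstate storage so the snippet runs outside REAPER. In a real ReaScript the lookups and deletions would go through GetExtState / HasExtState / DeleteExtState instead.

```python
from typing import Dict, List, Tuple

# Stand-in for REAPER's extstate storage: (section, key) -> value.
# Hypothetical layout: the plugin would write "segment_0", "segment_1", ...
ExtStates = Dict[Tuple[str, str], str]

SECTION = "ReaSpeech"  # assumed section name, not an existing plugin contract


def consume_segments(store: ExtStates, section: str = SECTION) -> List[str]:
    """Read all segment_N values in order, then delete them from the store."""
    segments: List[str] = []
    index = 0
    while (section, f"segment_{index}") in store:
        # pop() reads and deletes in one step, so nothing is read twice
        segments.append(store.pop((section, f"segment_{index}")))
        index += 1
    return segments


store: ExtStates = {
    (SECTION, "segment_0"): "open the mixer",
    (SECTION, "segment_1"): "start playback",
}
first = consume_segments(store)   # both segments, in order
second = consume_segments(store)  # empty until new text is written
```

Because the consumer deletes what it has read, a second pass only ever sees text written after the first pass.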
__________________
Use you/she/her.Ultraschall-Api Lua Api4Reaper - ReaGirl - a GuiLib for guis working for blind people |
|
|
04-17-2025, 03:22 AM
|
#7 |
|
Human being with feelings
Join Date: Dec 2019
Posts: 715
|
works great on OSX so far !
|
|
|
04-17-2025, 06:27 AM
|
#8 |
|
Human being with feelings
Join Date: Feb 2015
Location: Turkey
Posts: 263
|
w h o a !
I promise I won't be surprised next time when there's a new Reaper extension let loose, but you people never cease to amaze me. Grand work!
__________________
ReaKS / KeySwitch Manager for Reaper - Articulation Manager feature request Mastodon/@Ugurcan - @UgurcanFX |
|
|
04-17-2025, 06:38 AM
|
#9 |
|
Human being with feelings
Join Date: Dec 2012
Posts: 13,842
|
Congratulations!
|
|
|
04-17-2025, 12:54 PM
|
#10 |
|
Human being with feelings
Join Date: Mar 2016
Location: Italy
Posts: 458
|
great job!
|
|
|
04-17-2025, 01:31 PM
|
#11 |
|
Human being with feelings
Join Date: Dec 2019
Posts: 715
|
suggestion: would it be possible to let the list cues follow when playing back ?
|
|
|
04-17-2025, 02:40 PM
|
#12 |
|
Human being with feelings
Join Date: Sep 2022
Posts: 503
|
Hi,
I thought the concept of capturing speech might be useful when working with ripped vocal stems.
Setup: M1 Mac mini, Sequoia 15.3, permission granted, VST3 found in REAPER's FX menus, so no issues there.
I cut a 10-second section of vocals from the song "We Are the World" by splitting the item on both sides and deleting the outer parts. The plugin processed the whole track as opposed to just the 10 seconds of the remaining item. What it did in the output section, however, was show the entire text from the vocal stem but put the timestamps only on the split section.
On first use I didn't know how long the process was going to take, and I had tested with the entire vocal stem of "We Are the World", so I split out a smaller section, thinking it would process faster. I probably needed to re-render the section I split and then apply the processing?
An approximation of the processing time would be helpful. But a very useful tool!! Thanks for creating it and sharing it.
Last edited by 7enz; 04-17-2025 at 02:57 PM.
|
|
04-17-2025, 03:43 PM
|
#13 |
|
Human being with feelings
Join Date: Sep 2022
Posts: 503
|
After re-testing this script...
Is the script pulling an LLM from the web or somewhere, and saving it locally for reuse? I'm just wondering because the process button changes to "loading model", which can take up to 25-30 minutes. Last edited by 7enz; 04-17-2025 at 03:52 PM.
|
|
04-17-2025, 06:03 PM
|
#14 |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Love the idea! I'll see what I can do. Thanks for the suggestion.
|
|
|
04-18-2025, 08:22 AM
|
#15 |
|
Human being with feelings
Join Date: Mar 2007
Posts: 5,351
|
Here, Reaper recognizes the VST3 plugin, can insert it on a track, but the GUI of the plugin does not appear
(Lenovo Yoga-500, i5, Windows/Tiny 11, REAPER v7.34+dev0316 x64)
|
|
04-18-2025, 01:57 PM
|
#16 | |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Quote:
Would it work for your use case if the plugin just kept a constantly refreshed data structure in ProjExtState that represents the current state of the transcript? It seems like you would be able to write a ReaScript that copies this state somewhere, and you could control the lifecycle of those copies without the plugin needing to provide some sort of clear/reset API. Hope that makes sense.
Another question is which state to persist - the original transcript data will be relative to the source audio, but the transcript displayed in the grid shows constantly refreshed timestamps relative to the media's placement on the project timeline. The latter seems more useful. Furthermore, the grid has filtering options, and in some cases it might be useful to work with the filtered data.
Thanks for your feedback!
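To make the difference between source-relative and timeline-relative timestamps concrete, here is a minimal sketch of the mapping. It is not taken from the plugin's code: the function and parameter names are made up for illustration, and it assumes a simple item with a position, a start offset into the source and a constant play rate, ignoring looping, stretch markers and the like.

```python
from typing import Optional


def source_to_project_time(
    source_time: float,
    item_position: float,
    item_length: float,
    start_offset: float = 0.0,
    playrate: float = 1.0,
) -> Optional[float]:
    """Map a time in the source audio to the project timeline.

    Returns None when the source time falls outside the visible part of the item.
    """
    offset_in_item = (source_time - start_offset) / playrate
    if offset_in_item < 0.0 or offset_in_item > item_length:
        return None
    return item_position + offset_in_item


# A word at 12.0 s in the source; the item starts at 30.0 s in the project,
# is trimmed so it begins 10.0 s into the source, and is 5.0 s long.
inside = source_to_project_time(12.0, item_position=30.0, item_length=5.0, start_offset=10.0)
# A word at 2.0 s in the source lies before the trimmed start, so it has no
# place on the timeline for this item.
outside = source_to_project_time(2.0, item_position=30.0, item_length=5.0, start_offset=10.0)
```

Moving or trimming the item changes the result without touching the source-relative data, which is why the timeline view has to be refreshed constantly.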
|
|
|
04-18-2025, 02:00 PM
|
#17 | |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Quote:
I can add a progress bar to the download process, so at least you have an idea how long of a wait to expect. |
|
|
|
04-18-2025, 04:47 PM
|
#18 |
|
Human being with feelings
Join Date: Aug 2013
Location: Ukraine
Posts: 106
|
Super, now there is no problem with Cyrillic. Before it seemed impossible to me, but now it seems real: to make a script that would check the read text against the speaker's original text for errors, and mark them with markers. Is it possible to realise this? Like load the original text into the plugin, it analyses it and creates tokens indicating the differences.
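The comparison part of that idea can be sketched outside the plugin with Python's standard difflib module. This is only an illustration, not a ReaSpeech Lite feature: it compares two plain strings word by word, and the function name is made up. Turning each difference into a marker would additionally need the word timestamps from the transcript.

```python
import difflib
from typing import List, Tuple


def script_differences(script: str, transcript: str) -> List[Tuple[str, str, str]]:
    """Return (kind, expected, heard) for every place the two texts differ."""
    expected = script.lower().split()
    heard = transcript.lower().split()
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    differences = []
    for kind, a0, a1, b0, b1 in matcher.get_opcodes():
        if kind != "equal":  # kind is "replace", "delete" or "insert"
            differences.append((kind, " ".join(expected[a0:a1]), " ".join(heard[b0:b1])))
    return differences


diffs = script_differences(
    "the quick brown fox jumps over the lazy dog",
    "the quick brown box jumps over lazy dog",
)
# diffs == [("replace", "fox", "box"), ("delete", "the", "")]
```

Punctuation is not stripped here, so a real script would probably normalise both texts a bit more before comparing.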
|
|
|
04-19-2025, 01:29 PM
|
#19 | |
|
Human being with feelings
Join Date: May 2017
Location: Somewhere over the Rainbow
Posts: 6,966
|
Quote:
For instance, by setting a dedicated extstate to true.
Also interesting: I can imagine using it in our Ultraschall software to transcribe the individual tracks, so I would add it to several tracks and run dry rendering for it. Would it be possible to add the track number at which the plugin sits in the FX chain to the extstate as well? Like: "track=1;startpos=113.456;endpos=114.123;I am the transcribed text"? (For voice control, I would probably use it from monitoring FX, though I'm not sure whether you can detect that the plugin is on monitoring FX and signal it in the extstates as "track=monitoring".)
And setting the extstate section "ReaSpeech", key "reset", to true would make you delete all extstates and start from the first extstate again (you would need to set the reset extstate back to "" as well).
Thnx for considering the changes.
Btw, which license does ReaSpeech Lite have? This should do a lot of what I need.
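On the reading side, a record in the proposed "track=...;startpos=...;endpos=...;text" form could be parsed along these lines. The format is only the suggestion from this post, not something the plugin writes, and the Segment type and function name are made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    track: str    # "1", "2", ... or "monitoring"
    start: float  # seconds
    end: float    # seconds
    text: str


def parse_segment(record: str) -> Segment:
    """Parse 'track=..;startpos=..;endpos=..;text' into a Segment."""
    # maxsplit=3 keeps any ';' inside the transcribed text intact
    track_field, start_field, end_field, text = record.split(";", 3)
    fields = dict(field.split("=", 1) for field in (track_field, start_field, end_field))
    return Segment(
        track=fields["track"],
        start=float(fields["startpos"]),
        end=float(fields["endpos"]),
        text=text,
    )


segment = parse_segment("track=1;startpos=113.456;endpos=114.123;I am the transcribed text")
```

Putting the free text last means it can contain semicolons or equals signs without breaking the parser.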
|
|
|
04-20-2025, 01:49 AM
|
#20 |
|
Human being with feelings
Join Date: Apr 2013
Location: France
Posts: 11,131
|
Impressive work !
For storing data, maybe not proj extstate but track extstate, as it is per track?
__________________
Free ReaScripts - Premium Scripts - Custom Scripts Dev - Learn ReaScript - XR Theme - Stash Files - ReaLinks - ReaComics - Donation |
|
|
04-20-2025, 07:52 AM
|
#21 | |
|
Human being with feelings
Join Date: May 2017
Location: Somewhere over the Rainbow
Posts: 6,966
|
Quote:
Just wondering where to store it when using it as global monitoring FX...
|
|
|
04-20-2025, 08:47 AM
|
#22 |
|
Human being with feelings
Join Date: Apr 2013
Location: France
Posts: 11,131
|
@meo-ada mespotine
ARA plugins work on item file sources on tracks, so it is not meant for monitor FX 😉
|
|
04-20-2025, 08:56 AM
|
#23 |
|
Human being with feelings
Join Date: May 2017
Location: Somewhere over the Rainbow
Posts: 6,966
|
I'm talking about the VST, which is the one I would use.
|
|
04-22-2025, 03:50 AM
|
#24 |
|
Human being with feelings
Join Date: Aug 2013
Location: Ukraine
Posts: 106
|
If you process a large number of items, then recognition occurs partially, with random blocks being skipped, and you have to do gluing.
|
|
|
04-22-2025, 09:40 AM
|
#25 |
|
Human being with feelings
Join Date: Oct 2009
Location: France
Posts: 838
|
tried on a radio voiceover, impressive bravo !!
|
|
|
04-22-2025, 11:26 AM
|
#26 |
|
Human being with feelings
Join Date: Jan 2012
Location: Germany
Posts: 1,230
|
Wow, certainly very useful for podcast editing and the like. Especially through the integration of regions, markers and notes. Thanks a lot for sharing your wonderful work. Very much appreciated!
|
|
|
04-22-2025, 12:42 PM
|
#27 |
|
Human being with feelings
Join Date: Oct 2019
Posts: 10
|
This is awesome! Thank you so much for making it so much easier to get up and running than the original ReaSpeech. I love how streamlined and user friendly ReaSpeech Lite is.
I'm not sure if this is just due to my system, but transcription seems to take an exceptionally long amount of time, even after the models are downloaded. This would be an incredibly useful tool for me, but the transcription would just take too long to run, especially for multiple files. On my system, it seems like it's taking about 5x the duration of the file to generate the transcript. Would it be possible to add a way to "abort" the transcription in progress? I started running a transcription on a 1 hour long file and when it was evident that it would be taking a very long time, I tried to remove ReaSpeech from the FX chain and it crashed REAPER. |
|
|
04-22-2025, 02:47 PM
|
#28 |
|
Human being with feelings
Join Date: Apr 2014
Posts: 425
|
Incredible work tadave!
❤️ |
|
|
04-23-2025, 05:36 AM
|
#29 |
|
Human being with feelings
Join Date: Dec 2014
Posts: 28
|
Are you sure it isn't using it? I installed the normal version first and ran some tests which were quite slow. I then installed the CUDA toolkit and the CUDA version of this plugin. There is still only one version of the plugin appearing, but it runs a lot faster, and I can see it is using my GPU when processing. I think they have the same name, so the CUDA version overwrites the standard version.
|
|
|
04-23-2025, 06:46 AM
|
#30 | |
|
Human being with feelings
Join Date: Jun 2006
Posts: 22,772
|
Quote:
I just installed it again and I think it works, I just had to install it after the regular one? Last edited by Jae.Thomas; 04-24-2025 at 01:54 AM. |
|
|
|
04-29-2025, 10:49 AM
|
#32 |
|
Human being with feelings
Join Date: Jun 2021
Location: Moscow, Russia
Posts: 520
|
Reaper just doesn't see the CUDA version here either. Installing it over the basic version for Windows gave no result; in that case Reaper doesn't see it at all. The basic version in turbo mode works much slower (maybe 10 times) than Subtitle Edit's turbo mode, for comparison. I don't know whether Subtitle Edit uses CUDA while recognizing audio. Anyway, thanks for this great capability and for starting development of such a great plugin for Reaper!
|
|
|
04-30-2025, 04:04 AM
|
#33 |
|
Human being with feelings
Join Date: Dec 2011
Location: Norway
Posts: 110
|
Windows 11, latest
I got ReaSpeech to produce text (Nynorsk / Norway) with WindowsCuda. Create works OK, but there is no start information, no end information, and linking from the text to Reaper does not work. I see some "javascript". Do I need some kind of Java? The browser is MS Edge. Last edited by sveinpetter; 05-01-2025 at 05:42 AM.
|
|
05-01-2025, 09:17 AM
|
#34 |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
I imagine this is due to the embedded WebView that JUCE provides not being compatible with Tiny 11. This is the first I've heard of Tiny 11, so I have no idea if there's a workaround. If we were to get past that issue, I'm pretty sure the speech recognition performance would be too slow to be usable on that hardware. Thanks for trying it out, though!
|
|
|
05-01-2025, 09:19 AM
|
#35 |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
ReaSpeech Lite is licensed under the AGPLv3, which is the only non-commercial option for a JUCE 8 plugin.
|
|
|
05-01-2025, 09:23 AM
|
#36 | |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Quote:
I'll be publishing an update soon that adds cancellation. Thanks for the suggestion! |
|
|
|
05-01-2025, 09:27 AM
|
#37 |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
For those experiencing issues with the CUDA version not showing up in the FX browser, it would be helpful to know a few things:
Thanks! |
|
|
05-01-2025, 09:29 AM
|
#38 | |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Quote:
You should not need Java. You're seeing "javascript:" because the user interface is essentially a web page running in an embedded web browser, and that's what Edge displays when hovering over a hyperlink. I might be able to hide that message, but it's expected behavior at the moment. |
|
|
|
05-01-2025, 09:39 AM
|
#39 | |
|
Human being with feelings
Join Date: Sep 2024
Location: Phoenix, AZ
Posts: 19
|
Quote:
Since this is an ARA plugin, it behaves a bit differently from a regular VST. You can add the plugin to multiple tracks, and in REAPER you'll get a separate plugin instance for each track, but each plugin will be able to see all of the media items. Right now, it behaves a bit weirdly in that case, because the plugin does not use ARA to persist its state, but rather uses the older VST mechanism, such that each instance gets its own state. With ARA, the state is more of a project-level thing, and it would theoretically be possible for each instance to show the same data, and receive event updates when another instance updates that data.
It's also possible to add the plugin to individual media items instead of tracks. Those media items can span multiple tracks. Similar issues occur with state management.
So, I'll have to mess around with this some more, and may need to make changes to how the plugin manages persistence with respect to multiple plugin instances. That said, it still seems feasible to split up the transcript per-track, and this could be done with either ProjExtState or P_EXT as long as the data model makes sense given the design of ARA.
|
|
|
05-02-2025, 06:47 AM
|
#40 | |
|
Human being with feelings
Join Date: Dec 2011
Location: Norway
Posts: 110
|
Thanks, all ok now.
Quote:
I tried to install the CUDA version many times, at first with no success: it did not show up in Reaper. I used uninstall every time, and the basic version always showed up. The CUDA .vst3 file was present on disk, but Reaper did not find it. As soon as I got the CUDA Toolkit installed without errors, my next try with WindowsCuda was all OK.
On Windows, run nvidia-smi in cmd / terminal to get info about the NVIDIA card.
On my Lenovo ThinkPad P53 with a T2000 video card, I had to install Microsoft Visual Studio. Then I needed to install Nsight Systems first, downloaded from NVIDIA. Next step: install the CUDA Toolkit with Nsight Systems and the drivers deselected. No more errors, and a new install of CudaWindows showed up in Reaper.
A Norwegian Whisper Model for Automatic Speech Recognition no.ntnu_inspera_145904930_90195284.pdf
Last edited by sveinpetter; 05-02-2025 at 07:42 AM.
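For anyone who wants to script the nvidia-smi check, a rough sketch follows. It assumes nvidia-smi is on the PATH when the NVIDIA driver is installed, and the function name is made up. Note that it only confirms the driver can see the card; it says nothing about whether the CUDA Toolkit itself is installed.

```python
import shutil
import subprocess
from typing import Callable, Optional


def gpu_summary(
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> Optional[str]:
    """Return 'name, driver version' lines from nvidia-smi, or None if unavailable."""
    exe = which("nvidia-smi")
    if exe is None:
        return None  # driver tools not on PATH
    result = run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # nvidia-smi ran but reported an error
    return result.stdout.strip()
```

Calling gpu_summary() with no arguments runs the real command; the two parameters exist only so the lookup and the process call can be swapped out when trying the function on a machine without an NVIDIA card.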
|
|
|
|