I heard a doubled vocal, as well as occasional really quiet harmonies, and obviously there are some delayed parts after 'what its like' that come from a different take than the lead vocal.
A trick for doubled vocals like this is to compress the double harder than the lead so it doesn't peak out above it. Additionally, filter a bit of the low end out of the double so that plosives and consonant starts like 'te' and 'ke' don't pop through noticeably at different times in the two tracks.
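If it helps to see the idea outside a DAW, here is a rough Python sketch of that chain on the double: a simple one-pole high-pass followed by a static compressor. The function names, the 150 Hz cutoff, the 0.25 threshold and the 6:1 ratio are all my own placeholder assumptions, not settings taken from the track, and a real compressor plugin would also have attack and release smoothing that this leaves out.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate):
    """One-pole high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def compress(samples, threshold, ratio):
    """Static compressor: the part of each sample above threshold is divided by ratio."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, x))
    return out

def process_double(samples, sample_rate=44100):
    """Hypothetical chain for the doubled take; all settings are assumed, tune by ear."""
    filtered = high_pass(samples, cutoff_hz=150.0, sample_rate=sample_rate)
    return compress(filtered, threshold=0.25, ratio=6.0)
```

The lead vocal would go through gentler settings (or none), so the double always sits underneath it.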
Beyond that, it helps to get a great in-sync recording to begin with, but some small edits to line up words really go a long way toward hiding the second vocal.
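Lining words up is normally done by eye and ear in the editor, but as a rough illustration, the offset between a word in the lead and the same word in the double can be estimated with a brute-force cross-correlation. This is only a sketch: `best_lag` is a name I made up, and it assumes you have already cut out short, roughly matching snippets of each take as lists of samples.

```python
def best_lag(lead, double, max_lag):
    """Return the shift in samples at which `double` best matches `lead`.

    A positive result means the double is late by that many samples.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(lead):
            j = i + lag
            if 0 <= j < len(double):
                score += x * double[j]
        if score > best_score:
            best, best_score = lag, score
    return best

# Toy example: the double has the same shape as the lead but arrives 2 samples late.
lead_word = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
double_word = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
offset = best_lag(lead_word, double_word, max_lag=4)
```

You would then nudge that word in the double earlier by `offset` samples, and still check the result by ear.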