Discussion:
Sursound Digest, Vol 124, Issue 1
Brian FG Katz
2018-11-03 15:19:34 UTC
Syncing to video raises a number of questions; to me, these make the choice of tracker and REAPER HOA decoder secondary.

First, is the video individual or projected? Is this an HMD/Google Cardboard type situation? If the former, what are the audio/video sync latency tolerances for the piece?

Second, is the sound scene a "shared" experience, in that do the actions of one person affect the sound heard by another? Is the scene dynamically generated/rendered in real-time or is this basic off-line rendered audio HOA playback? Single point of view in the sound scene, or does each listener have their own view, and it is fixed or moving with the user's position?

If the audio is pre-rendered, without data upload from the user to the server, then a local HOA player started by a sync signal should suffice. This is far easier than hosting the audio on a server (REAPER), for which the tracker info needs to get from the user to the server, be decoded, then sent back. Having more than just a few wireless streams can be difficult, and broadcast streaming of HOA has issues, as true "broadcast" doesn't exist in WiFi the way it does on a wired LAN (i.e. don't bet on being able to send audio with low latency).
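To make the "local player started by a sync signal" idea concrete, here is a minimal sketch (the port number, message fields, and function names are all invented for illustration, not taken from any existing player): a controller broadcasts a small JSON "play" command over UDP carrying an absolute start time, and each local player schedules playback against its own clock rather than starting on packet arrival, which tolerates network jitter a little better.

```python
import json
import socket
import time

SYNC_PORT = 9000  # hypothetical port for the sync channel


def broadcast_play(start_in_s: float = 2.0) -> bytes:
    """Broadcast a 'play' command with an absolute wall-clock start time."""
    msg = json.dumps({"cmd": "play", "start_at": time.time() + start_in_s}).encode()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.sendto(msg, ("255.255.255.255", SYNC_PORT))
    sock.close()
    return msg  # returned so the message can be inspected


def schedule_playback(msg: bytes, start_player) -> None:
    """On each local mini-PC: wait until the shared start time, then start the player."""
    cmd = json.loads(msg)
    if cmd["cmd"] == "play":
        delay = cmd["start_at"] - time.time()
        if delay > 0:
            time.sleep(delay)
        start_player()  # e.g. launch the local HOA player process
```

This assumes the machines' clocks are roughly aligned (e.g. via NTP); the residual A/V offset then depends mainly on the player's own start-up latency.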

We faced similar questions on a project almost a decade ago now (see references below), where a common sound scene was to be delivered to multiple users in a shared (gallery) setting. The solution was local mini-PCs, each decoding the Ambisonic stream with its local head-tracker. If there is no upstream data, only "play" commands, this would be the better solution in my mind.

- N. Mariette, B. Katz, K. Boussetta, and O. Guillerminet, “SoundDelta: a study of audio augmented reality using WiFi-distributed Ambisonic cell rendering,” in Audio Eng. Soc. Conv. 128, (London), pp. 1–15, 2010.
- N. Mariette and B. Katz, “SoundDelta - Largescale, multi-user audio augmented reality,” in EAA Symp. on Auralization, (Espoo), pp. 1–6, 2009.
- http://remu.fr/remu/realisations/sound-delta/

For the plug-in, if the tracking is coming in via OSC, MIDI, or most other means, I would think you could redirect this control signal to different HOA-to-binaural converters (if you use a client-server architecture and send the binaural audio from the server).
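A rough sketch of that redirection, assuming tracker data arrives as OSC-style UDP packets whose address embeds a user id (the address pattern and port layout here are invented for illustration): a small relay parses the user id out of the address and forwards the packet to that user's converter instance.

```python
import socket

CONVERTER_BASE_PORT = 10000  # hypothetical: user n's converter listens on 10000 + n


def user_id_from_address(address: str) -> int:
    """Extract the user id from an address like '/user/3/orientation'."""
    parts = address.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "user":
        return int(parts[1])
    raise ValueError(f"no user id in {address!r}")


def relay_forever(listen_port: int = 9001) -> None:
    """Forward each incoming tracker packet to the matching converter's port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", listen_port))
    while True:
        data, _ = sock.recvfrom(4096)
        # An OSC message starts with its null-padded address string
        address = data.split(b"\x00", 1)[0].decode()
        uid = user_id_from_address(address)
        sock.sendto(data, ("127.0.0.1", CONVERTER_BASE_PORT + uid))
```

In practice each converter could equally well listen on one port and filter by address itself; the per-port layout just keeps each converter instance unaware of the others.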

Of course, if you want to go beyond HOA and do dynamic binaural rendering, I'll offer a small bit of promo for Anaglyph, our research-developed binaural VST plug-in: http://anaglyph.dalembert.upmc.fr/

Best of luck,
--
Brian FG Katz, Ph.D, HDR
Research Director, CNRS
Groupe Lutheries - Acoustique - Musique
Sorbonne Université, CNRS, UMR 7190, Institut Jean Le Rond ∂'Alembert
bureau 510, 5ème, allé des tours 55-65
4, Place Jussieu
75005 Paris
Tel: (+33) 01 44 27 80 34
web_perso: http://www.dalembert.upmc.fr/home/katz
web_lab: http://www.dalembert.upmc.fr
web_group: http://www.dalembert.upmc.fr/lam


-----Original Message-----

From: Simon Connor <***@gmail.com>
To: ***@music.vt.edu
Subject: Re: [Sursound] Multi-user head tracked binaural ambisonics
query
Message-ID:
<***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Thanks for the swift reply Lorenzo! This is a great start.

I shall check these out. One key thing I should have mentioned is that the audio feed needs to be synced to video (hence the REAPER functionality, as this will simultaneously run the audio and video). So the audio will need to be streamed to all users synchronously. I'm not sure whether this has implications for the use of mobile phones?
Cheers
Simon
Simon Connor
2018-11-05 20:35:03 UTC
Thanks Brian

I will definitely check out the project and papers you mentioned, as this
sounds like a similar thing. Regarding your other queries:

*First, is the video individual or projected? Is this HMD/GoogleCardboard
type situation? If the first, what are the audio/video sync latency
tolerances for the piece? *

The video will be a single large-scale projection rather than individual
VR / Google Cardboard headsets. The audio will be in sync with the image. I've
not had a chance to test yet, so I'm not certain of the tolerable latency, but
at a guess up to a couple of milliseconds if possible.

*Second, is the sound scene a "shared" experience, in that do the actions
of one person affect the sound heard by another? Is the scene dynamically
generated/rendered in real-time or is this basic off-line rendered audio
HOA playback? Single point of view in the sound scene, or does each
listener have their own view, and it is fixed or moving with the user's
position? *

The sound scene is played back from off-line rendered audio: a fixed/linear
3OA audio stream, which I would then like to have decoded binaurally to each
user's headphones. For now this will be fixed rather than constantly moving.

This audio stream would be the same for each user - a binaural mix, but
modified depending on their own unique head movements. So the data flow
would be: audio and video played in sync, audio sent (potentially wirelessly)
to approx. 6 pairs of headphones with head-trackers, each user's head-tracker
sending data back to their individual binaural decoder, and each user then
receiving their own modified stream from it.
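For the "modified depending on their own unique head movements" step, the usual trick with a shared Ambisonic stream is to counter-rotate the scene by each user's yaw before a head-locked binaural decode. A first-order sketch, assuming a convention where X ∝ cos(azimuth) and Y ∝ sin(azimuth) (sign and channel conventions vary between toolchains, so check yours):

```python
import math


def rotate_foa_yaw(w, x, y, z, yaw_rad):
    """Rotate one first-order B-format sample about the vertical axis.

    Rotating the sound field by +yaw moves a source at azimuth a to a + yaw;
    for head tracking, pass -head_yaw to counter-rotate the scene.
    W and Z are unaffected by a rotation about the vertical axis.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return w, x * c - y * s, x * s + y * c, z


# A source straight ahead (X=1, Y=0) rotated by +90 degrees
# should end up at azimuth 90 degrees (X ~ 0, Y ~ 1).
w, x, y, z = rotate_foa_yaw(1.0, 1.0, 0.0, 0.0, math.pi / 2)
```

Note that for the 3OA stream described above this generalises to per-order rotation matrices over all 16 third-order channels; HOA libraries typically provide these, so the sketch is only meant to show where the per-user head data enters the chain.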

I realise this may be too ambitious, but I would like to gauge people's
experiences to see if it will actually be possible, and the recommended
technical implementation.
So thanks all for the suggestions so far!

Cheers
Simon

Picinali, Lorenzo
2018-11-05 21:20:30 UTC
Dear Simon,


After reading the description of what you want to do, I thought of another solution.


We are currently working on an iOS implementation of the 3D Tune-In Toolkit. It's an HRTF-based binaural spatialisation engine, so not an HOA-to-binaural converter, but...if you decode the HOA stream to an array of loudspeakers and then render each of them as an individual source in the binaural spatialiser, it will work. It currently runs on an iPad Pro 2 and can handle 25-30 anechoic sources at the same time (without binaural reverberation, so...anechoic spatialisation). The head tracking is done using the sensors on the phone/tablet, and each source can be played/stopped remotely via OSC. All source positions can be imported into the application through a configuration file (JSON), while the sound files need to be located on the mobile device (but you can also read them from Dropbox or other online storage). It's just a prototype at the moment, but it already works very well. The iPad Pro 2 is a rather powerful machine, but current iPhones and iPads will do even better.
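To make the configuration-file idea concrete, here is what such a JSON description of the decoded loudspeaker layout might look like; the field names and files below are invented for illustration and are not the 3D Tune-In Toolkit's actual schema. Each virtual loudspeaker of the decoded HOA array becomes one anechoic source in the binaural spatialiser:

```python
import json
import math

# Hypothetical config: one entry per virtual loudspeaker feed.
CONFIG = """
{
  "sources": [
    {"file": "ls_front.wav", "azimuth_deg": 0,   "elevation_deg": 0, "distance_m": 2.0},
    {"file": "ls_left.wav",  "azimuth_deg": 90,  "elevation_deg": 0, "distance_m": 2.0},
    {"file": "ls_back.wav",  "azimuth_deg": 180, "elevation_deg": 0, "distance_m": 2.0},
    {"file": "ls_right.wav", "azimuth_deg": 270, "elevation_deg": 0, "distance_m": 2.0}
  ]
}
"""


def to_cartesian(src):
    """Convert one source entry to x/y/z in metres (x forward, y left, z up)."""
    az = math.radians(src["azimuth_deg"])
    el = math.radians(src["elevation_deg"])
    r = src["distance_m"]
    return (r * math.cos(el) * math.cos(az),
            r * math.cos(el) * math.sin(az),
            r * math.sin(el))


sources = json.loads(CONFIG)["sources"]
positions = [to_cartesian(s) for s in sources]
```

A square layout like this would only suit first-order material; a 3OA decode would normally target more virtual loudspeakers, which is where the 25-30 source budget mentioned above matters.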


So...in this case, you'll just need to synchronise the OSC play/stop with the video, and it will play remotely on every user's device.


If interested, let me know and we'll find a way for you to try it (TestFlight or similar).


Best regards
Lorenzo




--
Dr Lorenzo Picinali
Senior Lecturer in Audio Experience Design
Director of Undergraduate Studies
Dyson School of Design Engineering
Imperial College London
Dyson Building
Imperial College Road
South Kensington, SW7 2DB, London
T: 0044 (0)20 7594 8158
E: ***@imperial.ac.uk

http://www.imperial.ac.uk/people/l.picinali

www.imperial.ac.uk/design-engineering-school