Hello folks! In this article, let’s have a look at one of the most important features of an AR environment: spatial mapping. Spatial mapping provides a way to represent the real-world surfaces around us in a mixed reality environment in a convincing manner. Microsoft, of course, has done a very good job of explaining the theoretical concept behind it here: https://docs.microsoft.com/en-us/windows/mixed-reality/spatial-mapping
In this tutorial, I will show you how you can use the HoloToolkit to apply Spatial Mapping in your project. We will also look at what Spatial Understanding is and how we can use it.
Let’s first look at what Spatial Understanding is:
While Spatial Mapping provides a way to represent real-world surfaces in our MR environment, Spatial Understanding extends this representation to identify what those surfaces are. This means it makes decisions on what it perceives as floors, ceilings, walls, flat surfaces, etc.
The Spatial Understanding library provided in the HoloToolkit lets us place objects “intelligently” in our scene so that they appear to lie correctly on surfaces such as tables and chairs.
Ok, let’s get started then.
Starting with the usual things:
1. Create a new project. I am using Unity version 2018.3.12f1.
2. I’m using the latest stable release of the HoloToolkit, 2017.4.3.0. Import it into your project. (Many have asked me about MRTK v2; however, it is still unstable, so I don’t recommend it if you need reliable functionality. It is, however, great to experiment with :-))
3. Do the usual tweaks:
a. Delete the main Camera. Add the MixedRealityCamera.prefab to the Hierarchy; this will serve as the Main Camera.
b. Change the camera settings to the HoloLens-preferred values:
Clear Flags from Skybox to Solid Color
Background to 0, 0, 0, 0
In the MixedRealityCameraManager script attached to the MixedRealityCamera object, also set the Clear Flags value to Solid Color
Ensure that the Background Color there is black
c. Go to Edit -> Project Settings -> Player -> Other Settings and change the Scripting Runtime Version to .NET 4.x Equivalent and the Scripting Backend to .NET. Ensure that UWP (Universal Windows Platform) is selected as the build platform.
Also make sure that, under XR Settings, Virtual Reality Supported is checked.
We need both the InputManager and the DefaultCursor in the scene.
4. InputManager
Drag and drop the InputManager.prefab into your scene
5. DefaultCursor
Drag and drop the DefaultCursor.prefab into your scene. In the InputManager’s Inspector settings, drag and drop the DefaultCursor into the Cursor field of the SimpleSinglePointerSelector script.
6. Save the scene
Create an empty GameObject called Managers so that you can put all your manager-related objects under it.
Now for the SpatialMapping part:
Look for the SpatialMapping prefab in the HoloToolkit and add it as a child object of Managers.
Notice that the SpatialMapping prefab already has three scripts attached to it. A quick explanation of the three scripts is provided on the Microsoft page here: https://docs.microsoft.com/en-us/windows/mixed-reality/spatial-mapping
In the Spatial Mapping Observer, Triangles Per Cubic Meter controls how many triangles the mesh of the environment will contain. The higher the value, the more processing time it takes. I have found that a value of around 2000 is detailed enough without wasting too much of your HoloLens’ processing time.
I’d recommend not changing any other values, at least if this is your first attempt at Spatial Mapping.
In the Spatial Mapping Manager, the Surface Material defaults to Wireframe. This is the familiar mesh you see when the HoloLens first starts up and scans the environment. If you check Draw Visual Meshes, the mesh is rendered around you. I like to keep this option checked so I can watch the mesh being drawn over the environment as I look around; it gives you visual feedback that the mapping is indeed working. So let’s keep it checked.
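Both settings are normally done in the Inspector, but for reference they can also be set from code. Below is a minimal sketch, assuming the HoloToolkit 2017.4 component names SpatialMappingObserver and SpatialMappingManager; the helper script itself (SpatialMappingSettings) is just an illustration, not part of the toolkit.

```csharp
using HoloToolkit.Unity.SpatialMapping;
using UnityEngine;

// Hypothetical helper: attach to the SpatialMapping prefab to apply the same
// Inspector settings from code (HoloToolkit 2017.4 names assumed).
public class SpatialMappingSettings : MonoBehaviour
{
    private void Start()
    {
        // ~2000 triangles per cubic meter is a good balance between detail and processing time.
        GetComponent<SpatialMappingObserver>().TrianglesPerCubicMeter = 2000f;

        // Keep the wireframe mesh visible while scanning for visual feedback.
        SpatialMappingManager.Instance.DrawVisualMeshes = true;
    }
}
```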
Let’s see what the Spatial Mapping result looks like. Before you build, go to Player Settings -> Publishing Settings -> Capabilities and check Spatial Perception. This is essential for both Spatial Mapping and Spatial Understanding to work.
Now build and deploy onto the HoloLens and see what happens. If everything is configured correctly, you should see the mesh extending over the environment as you move and look around.
The Spatial mapping video output is shown below:
Now for the Spatial Understanding part:
Search for the SpatialUnderstanding prefab in the HoloToolkit and add it as a child object of Managers.
I kept the Inspector settings on it as is.
Make sure Auto Begin Scanning is checked. The Mesh Material is already set to SpatialUnderstandingSurface, which renders the mesh in a default blue so you can see it. Also keep Create Mesh Colliders checked; this gives the understanding mesh colliders, so raycasts and physics can interact with the scanned environment.
Now for an important change: you don’t want two visible meshes in the scene, and we still have the earlier one from SpatialMapping. Let’s disable it: go back to the SpatialMapping Inspector settings and uncheck Draw Visual Meshes.
For the next part:
I’d like to finish the Spatial Understanding scan with a tap. I’d also like to see what kinds of surfaces the mesh identifies as it maps the real-world environment. I will do this via a script. Under Managers, create an empty GameObject and call it RoomSetupManager or something similar.
Add a script to it and call it RoomSetup. Edit the script in Visual Studio. I will adapt parts of the code from the MixedRealityExamples and from a Southworks article here: https://medium.com/southworks/how-to-use-spatial-understanding-to-query-your-room-with-hololens-4a6192831a6f
The RoomSetup script will include the following features:
1. Automatically start the Spatial Understanding scan
2. Finish the Spatial Understanding scan with a tap gesture once the scanning is over
3. Query the type of the surface being gazed at with a raycast, triggered by a speech command like “Ray”
Let’s look at this one by one.
1. To start the scan, call Spatial Understanding’s RequestBeginScanning() in our Start() method.
We also need to add the corresponding LogSurfaceState() method, along with Update() and ScanStateChanged() methods, to report the scanning progress and handle the completed scan.
2. For the second part, we need to include the code to register a tap (see the full article on how to use tap gestures: https://codeholo.com/2018/01/26/using-gaze-and-tap-to-select-the-objects-of-your-choice/ ) and call the RequestFinishScan() method accordingly. So include an OnInputClicked() method and call RequestFinishScan() from there. A sketch of these two parts is shown below.
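To make these first two parts concrete, here is a minimal sketch of what the RoomSetup script could look like at this stage. It assumes the HoloToolkit 2017.4 APIs (SpatialUnderstanding, SpatialUnderstandingDll, InputManager, IInputClickHandler); the field name VisualInfo (shown as Visual Info in the Inspector) and the exact wording of the status texts are my choices and may differ slightly from the screenshots and the Git project.

```csharp
using System;
using HoloToolkit.Unity;
using HoloToolkit.Unity.InputModule;
using UnityEngine;

public class RoomSetup : MonoBehaviour, IInputClickHandler
{
    // Assigned in the Inspector: the VisualFeedback 3D Text object (added in the next step).
    public TextMesh VisualInfo;

    private void Start()
    {
        // Receive taps no matter what the gaze is currently hitting.
        InputManager.Instance.AddGlobalListener(gameObject);

        // Get notified when the scan state changes (Scanning, Finishing, Done, ...).
        SpatialUnderstanding.Instance.ScanStateChanged += ScanStateChanged;

        // 1. Start the Spatial Understanding scan automatically.
        SpatialUnderstanding.Instance.RequestBeginScanning();
    }

    private void Update()
    {
        // While scanning, keep the surface statistics on screen.
        if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Scanning)
        {
            LogSurfaceState();
        }
    }

    private void ScanStateChanged()
    {
        if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Done)
        {
            VisualInfo.text = "Scan Finished";
        }
    }

    private void LogSurfaceState()
    {
        // Query the playspace statistics from the understanding DLL.
        IntPtr statsPtr = SpatialUnderstanding.Instance.UnderstandingDLL.GetStaticPlayspaceStatsPtr();
        if (SpatialUnderstandingDll.Imports.QueryPlayspaceStats(statsPtr) == 0)
        {
            return;
        }

        SpatialUnderstandingDll.Imports.PlayspaceStats stats =
            SpatialUnderstanding.Instance.UnderstandingDLL.GetStaticPlayspaceStats();
        VisualInfo.text = string.Format(
            "TotalSurfaceArea: {0:0.0}\nWallSurfaceArea: {1:0.0}\nHorizSurfaceArea: {2:0.0}",
            stats.TotalSurfaceArea, stats.WallSurfaceArea, stats.HorizSurfaceArea);
    }

    // 2. Finish the scan with an air tap.
    public void OnInputClicked(InputClickedEventData eventData)
    {
        if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Scanning)
        {
            VisualInfo.text = "Finishing scan";
            SpatialUnderstanding.Instance.RequestFinishScan();
        }
    }
}
```

In a real project you would also unsubscribe from ScanStateChanged in OnDestroy(); I have left that out to keep the sketch short.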
We need visual feedback about what is being identified and about when the scan starts and finishes. So let’s add a TextMesh to the code. Go back to the project and add a 3D Text object to the scene. Call it VisualFeedback or something similar and adjust its parameters. The following screenshot shows the settings that worked best for me.
Now go back to the RoomSetupManager and, in its RoomSetup script, drag and drop the VisualFeedback text mesh into the Visual Info field.
The text needs to move with the user as they scan the room, and it should always face the user as well. So let’s add the Tag Along and Billboard scripts to it (here’s the full article on how to do that: https://codeholo.com/2018/12/24/tag-along-billboarding-and-other-menu-elements-for-hololens-apps/ ).
Before we step into part 3, let’s test these two parts. Build and deploy. If everything works, you should see a blue mesh and the text, which tags along and faces you. The text should read something like TotalSurfaceArea: .., WallSurfaceArea: .. and so on, and it should update itself regularly. Look around the room and watch the blue mesh extend itself. The more you look around, the more Spatial Understanding learns about the room. Once you are satisfied, tap, and the text should say “Finishing scan”. Notice how the blue mesh fills in the empty spaces and creates a virtual room from what it has scanned. Once it is done, which takes a matter of seconds, the message should say “Scan Finished”.
For part 3:
We want the app to identify the type of surface that our gaze raycast hits in the blue mesh. For that, we need to cast a ray whenever we use a speech command called “Ray” and read back the surface type. So if we look at the floor, the VisualInfo text should say Type of Surface: FloorLike; at the ceiling, Type of Surface: Ceiling; and so on.
Note that Spatial Understanding can recognize only a limited set of surface types. In my experience it is also not that robust, i.e. the surface type is not always identified correctly, but it works to a large extent.
Let’s include this part in our script:
First add an empty GameObject called SpeechManager under Managers. Add the SpeechInputHandler and SpeechInputSource components to it. Add the speech keyword “Ray” and check Is Global Listener. (Read the full article on how to use speech commands with the MRTK here: https://codeholo.com/2017/12/03/how-to-use-voice-input-in-hololens/ )
In our RoomSetup script, first add a private variable for raycastResult.
Next we will write our raycast check in a function called RayCastChecker(). In this function, we make the camera send out a raycast in the forward direction and check what type of surface it hits via the SpatialUnderstanding GetStaticRaycastResultPtr() function. All of this happens only when the SpatialUnderstanding ScanState is Done; you don’t want to start checking before Spatial Understanding is ready. I also use our visualInfo to display the surface type. Since the display happens in our Update() method, we add a flag called textInfoDisplay. The purpose of this flag is to display the text “Scan Finished” only while it is false. As soon as we say “Ray”, the flag is set to true, and from then on Update() no longer sets the visualInfo text to “Scan Finished” but leaves it showing the surface type result.
The code screenshot is shown below.
Attached below is also a snippet of the Update() function with the flag check.
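For readers who prefer text over screenshots, here is a minimal sketch of RayCastChecker() together with the extended Update() method, assuming the HoloToolkit 2017.4 SpatialUnderstanding DLL imports (PlayspaceRaycast, GetStaticRaycastResultPtr). The names raycastResult and textInfoDisplay follow the description above; the 10-meter ray length is an arbitrary choice.

```csharp
// Flag described above: false -> Update() shows "Scan Finished",
// true -> the text holds the latest surface type result.
private bool textInfoDisplay;

// Called by the SpeechInputHandler when the user says "Ray".
public void RayCastChecker()
{
    // Only query once Spatial Understanding has finished its scan.
    if (SpatialUnderstanding.Instance.ScanState != SpatialUnderstanding.ScanStates.Done)
    {
        return;
    }

    textInfoDisplay = true;

    // Cast a ray from the camera along the gaze direction into the understanding playspace.
    Vector3 rayPos = Camera.main.transform.position;
    Vector3 rayVec = Camera.main.transform.forward * 10.0f; // assumed 10 m ray length

    IntPtr raycastResultPtr = SpatialUnderstanding.Instance.UnderstandingDLL.GetStaticRaycastResultPtr();
    SpatialUnderstandingDll.Imports.PlayspaceRaycast(
        rayPos.x, rayPos.y, rayPos.z,
        rayVec.x, rayVec.y, rayVec.z,
        raycastResultPtr);

    SpatialUnderstandingDll.Imports.RaycastResult raycastResult =
        SpatialUnderstanding.Instance.UnderstandingDLL.GetStaticRaycastResult();

    // e.g. "Type of Surface: FloorLike", "Type of Surface: Ceiling", ...
    VisualInfo.text = "Type of Surface: " + raycastResult.SurfaceType;
}

private void Update()
{
    if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Done)
    {
        // Only overwrite the text with "Scan Finished" while no "Ray" result is being shown.
        if (!textInfoDisplay)
        {
            VisualInfo.text = "Scan Finished";
        }
    }
    else if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Scanning)
    {
        LogSurfaceState();
    }
}
```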
Go back to the SpeechManager and assign the RayCastChecker() function to the “Ray” keyword in the SpeechInputHandler.
Don’t forget to enable the Microphone capability under Publishing Settings -> Capabilities -> Microphone.
All right, now just build and deploy and try it out. Once you tap, you should see “Scan Finished” displayed as soon as Spatial Understanding has finished its scan. Then say “Ray” while looking at different objects; the type of surface is detected and displayed as text. Below is a video of the output (sorry about the low-battery popup in between). You’ll see that while scanning I move my head quite a bit so that the blue mesh covers as much of the environment as possible. Bear in mind that Spatial Understanding is not all that great, so do not expect precise results from it. It did not recognize the floor when I took this video, but a more detailed scan (2-3 minutes of the room) produced a good result. I have deliberately included the wrong results to show you that it is not entirely reliable. It does, however, identify the wall, ceiling and platform.
The full Git project is here: https://github.com/NSudharsan/HoloLensExamples/tree/master/SpatialMappingAndUnderstandingExample
Video output is shown below:
Errors and Solutions:
Error: Spatial Mapping does not do anything
Solution: Check that Spatial Perception is checked under Capabilities
Error: Voice command not responsive
Solution: Make sure Microphone is enabled in Capabilities
Error: The exact surface type is not identified
Solution: Scan the room more thoroughly so Spatial Understanding can achieve better results. In my experience it is, however, not that robust.