In this work, we propose an AR system for detecting early signs of dementia by recognizing human-object interactions. It combines the HoloLens 2 with an existing action estimation architecture based on the simultaneous detection of the hands and the objects in the current scene using YOLO.
First, install Unreal Engine 4, version 4.27 or later.
```shell
# clone project
git clone https://github.com/3dperceptionlab/Holo4Care.git
```

Compile the project to generate the Visual Studio solution files. To deploy to the device, generate the package and upload it to the Mixed Reality Portal (upload the .appxbundle, select "Allow me to select framework packages", and upload the .appx file). If it is the first time, you will need to generate a signing certificate. To do so, in UE4, go to Edit > Project Settings > Platform > HoloLens and, under Packaging / Signing Certificate, click Generate New.
This folder contains the C++ components that orchestrate image capture, data transmission, and prediction visualisation inside Unreal Engine:
- Registers the project’s primary module through `IMPLEMENT_PRIMARY_GAME_MODULE`, giving Unreal the entry point it needs to load the game.
- Declares the base game mode. The class itself is minimal because runtime logic is delegated to specialised actors such as the HTTP interface and the prediction objects.
Actor in charge of talking to the REST API that processes the frames captured by HoloLens.
- `MakePost(UTextureRenderTarget2D*, FString, int)`: exports the camera render target to PNG with `FImageUtils::ExportRenderTarget2DAsPNG`, Base64-encodes the buffer, and builds a `multipart/form-data` payload via `AddData`. It sends the `POST` request to the configured endpoint and starts a timing trace for debugging.
- `AddData` and `FStringToUint8`: helper utilities that wrap each field of the payload using the boundary markers required by the multipart protocol.
- `OnResponseReceived(...)`: handles the `POST` response. When the server returns a JSON containing a `getURL`, the method enqueues that endpoint so predictions can be pulled later.
- `GetJSONItems()`: dequeues pending URLs, issues the corresponding `GET` requests, and inspects HTTP status codes. A `200 OK` response passes control to `ProcessJSONtoObject`, which spawns actors for each detection.
- `ProcessJSON(...)` and `ProcessJSONtoObject(...)`: parse the prediction JSON into either lists of positions or fully fledged `APredictionObject` instances populated with class labels, actions, and bounding boxes. `ProcessJSONtoObject` also configures their visuals and refreshes the `predictionObjects` array.
- `AddNewPredictionObjets(...)`: merges newly spawned detections with previously spawned ones, scales widgets according to camera distance, keeps visibility in sync, and prunes stale actors to keep the collection bounded.
Taken together, this actor drives the entire data loop: capture the render target, post it to the backend, wait for the result URL, and rebuild the scene with the received predictions.
Defines the visual actor that represents each prediction in the 3D scene.
- Constructor: creates the mesh (`UStaticMeshComponent`) and text (`UTextRenderComponent`) elements, wiring them to a tactile `UUxtPressableButtonComponent` that toggles visibility.
- `ConfigNode()`: cleans up the label text (`getCleanText`), adjusts scale/offset for the text components, and shows up to two lines of suggested actions.
- `onButtonPressed`, `onButtonBeginFocus`, `onButtonEndFocus`: handle user interaction (toggle action visibility and recolour the node when the user focuses it).
- `destroy()`: safe wrapper around `Destroy()` used by `HTTPInterface` when recycling actors.
Actor devoted to low-level TCP socket experiments. It lets you send or receive images outside the HTTP workflow.
- `CreateServer()` and `CreateClient()`: configure sockets, apply options such as `SO_REUSEADDR` and `SO_REUSE_UNICASTPORT`, and bind/connect to the configured IP and port.
- `SendImage()`: reads a local file (`test.png`), sends its size first, and then streams the raw bytes to the connected server.
- `ReceiveImage()`: accepts inbound connections, rebuilds the received byte buffer, and saves it to disk (`test_server.png`).
- `BeginPlay()` and `EndPlay(...)`: initialise the instance as either server or client depending on `IsRunningDedicatedServer()` and close sockets cleanly on shutdown.
Provides static helpers for integrating external scripts and pipelines.
- `ExecuteSystemCommand(...)`: launches operating system processes via `FPlatformProcess::ExecProcess` and returns the captured standard output.
- `LoadFile(...)`: reads binary files from disk into a `TArray<uint8>` (used for textures produced by Python routines).
- `SendDataToServer(...)` and `ReceiveDataFromClient(...)`: early sketches of Unreal networking (`UNetConnection`) integration for sharing textures between client and server.
- `SaveTextureToArchive(...)`, `LoadTextureFromArchive(...)`, and `SaveTextureToFile(...)`: serialise `UTexture2D` assets to binary streams so they can be transmitted or persisted.
- `HTTPInterface` captures the current frame, uploads it to the API, and requests the results.
- When detections arrive, `ProcessJSONtoObject` spawns `PredictionObject` actors that display class labels and suggested actions in 3D.
- `NetInterface` and `PythonAPI` offer alternative ways to move data around (raw sockets and external processes) for experimenting with AI pipelines outside the engine.
- Manuel Benavent-Lledo (mbenavent@dtic.ua.es)
- David Mulero-Perez (dmulero@dtic.ua.es)
