UCL MotionInput 3 - AI PC apps for Windows
In collaboration with project supervisors from Microsoft, Intel and IBM.
UCL MotionInput v3 is our latest AI PC software for Touchless Computing interactions. It is a means of interaction with a PC without the need to touch it, with just a webcam. A user interacts with this software on their PC via gestures with their hands, head, face, full body and their speech. The software analyses interactions and converts them into mouse, keyboard and joypad signals making full use of your existing software. The software was developed by academics and students at University College London's Department of Computer Science.
The AI PC software contains:
- Hands-based tracking modes - such as in-air multitouch, mouse, keyboard, digital pen and joypad
- Facial Navigation modes - mixing facial switches with nose and eyes navigation
- Exercises in gaming modes - users place hot-spot triggers in the air around them, along with combinations of "walking on the spot" recognition, for first person and third person gaming, retro gaming and more.
- Simultaneously recognising speech alongside all of the above - for mouse events like "click", for app events like "show fullscreen" in PowerPoint, for operating system events like "volume up" and in your own phrases in your games and applications - along with live captioning.
- See our demonstrations on our YouTube playlist!
UCL CS Academics and Students
Apps developed and growing
Machine Learning, NLP and Computer Vision AI PC embedded environments
Years of Development
The solution is fully customisable, with mixed machine learning and computer vision models for users' intentions. It serves settings ranging from education in schools, to safer computing in hospitals for both patients and staff, to gaming with exercises and entertainment access at home.
Use your nose or eyes to navigate, and a set of facial expressions to trigger actions like mouse button clicks - or, with speech, say "click".
A powerful selection of hand gestures that can be recognised and mapped to specific keyboard commands, mouse movements, native multitouch sensing, digital pens (with depth in the air!) and light guns.
An auto-calibration method for eye-tracking that obtains the gaze estimation, including both a grid mode and a magnetic mode, for aligning the mouse cursor in accessibility scenarios.
This mode allows users to engage with their games within their own ranges of movement. Users can set physical exercises and tag regions in their surrounding space to play their existing computer games.
Ask-KITA allows users to interact with a computer through a set of voice commands, live captioning and overriding keyboard shortcuts with phrases.
This mode enables users to play games with the usual ABXY joypad buttons in the air, with analogue trigger controls.
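As an illustration of how a hand gesture such as a pinch could be turned into a mouse click, here is a minimal sketch. It is not MotionInput's implementation: the `PinchDetector` class, its thresholds and the landmark format (normalised `(x, y)` fingertip positions from some hand-tracking model) are all assumptions made for this example. Using two thresholds (hysteresis) keeps a hand hovering near the boundary from producing repeated clicks.

```python
import math
from dataclasses import dataclass


@dataclass
class PinchDetector:
    """Hypothetical edge-triggered pinch detector (not MotionInput's API)."""

    press_threshold: float = 0.05    # assumed: a pinch starts below this distance
    release_threshold: float = 0.08  # assumed: a pinch ends above this distance
    pinched: bool = False

    def update(self, thumb_tip, index_tip):
        """Return "down", "up" or None for one frame of fingertip positions."""
        distance = math.dist(thumb_tip, index_tip)
        if not self.pinched and distance < self.press_threshold:
            self.pinched = True
            return "down"
        if self.pinched and distance > self.release_threshold:
            self.pinched = False
            return "up"
        return None


def pinch_events(frames, detector=None):
    """Run the detector over a sequence of (thumb_tip, index_tip) frames."""
    if detector is None:
        detector = PinchDetector()
    return [detector.update(thumb, index) for thumb, index in frames]
```

A caller would translate "down" and "up" into mouse button press and release events using whichever input-injection library it prefers.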
UCL MotionInput 3 is a suite of assistive technology software solutions for public, research, industrial and commercial applications of touchless computing. The software is being built as separate purpose-built apps and is licensed free of charge by University College London for individual personal use. For commercial and embedded solutions, the system has a core API that UCL-registered commercial developers can license from UCL to build touchless applications and to easily customise solutions for existing software.
Here are some of its key technical features, all of which are freely available in the public release for individual personal users:
- In-air native Windows Multitouch, with two touchpoints (as pinches) registered with the Windows 10/11 OS (available). Simultaneously, you could pinch the air to click or, with speech mode on, just say "click", "double click" and "right click" to perform the mouse click.
- In-air native Digital Inking compatible with Microsoft Surface pen applications and Windows Ink, with depth sensing in the air, both with and without hand-held instruments such as pencils and pens.
- In-air Mouse for industrial and clinical touchless computing applications.
- Two-handed digital and analogue in-air joypad buttons and triggers simulating popular joypad movement controls.
- Federated speech with Ask-KITA speech hotkey shortcuts ("print", "copy", "show fullscreen"), and live caption dictation into text editors (MS Word), applications with text fields (Teams, Zoom), browsers and search engines. This is without installation of services or cloud processing - intended for trialling in healthcare and industrial setting operations. You can override any keyboard shortcut in any existing application with any phrase that the user wishes to define.
- Custom gesture shape recording for hand movements with assignable speech recognition - swiping in the air, throwing over your shoulder etc. can be assigned to any keyboard presses.
- Facial Navigation combined simultaneously with eye-gaze tracking or the nose as a joystick, with either speech modes or facial landmark features such as kiss/fishface, raised eyebrow, grinning and opening mouth as facial switches.
- Placement of hit triggers in the air in the space around you, to trigger keyboard and joypad events. For example, placing one above your head means you have to jump to activate a button.
- Walking on the spot as a learnt ML model for moving in games, which you can customise for your comfort for how far a game character should move.
- Simultaneous hybrid ML model processing, for pseudo-VR in your living room. For example, you are playing your favourite first person game by walking on the spot to move forward, using your hand in the air as the mouse cursor to move the view camera in the game, and using the hit triggers with the other hand to change the direction of walking. This can even be combined with speech for calling up game menus, such as "call up inventory" or "switch item". Keep in mind that none of the existing software needs to be changed or patched to use this.
The software is standalone without any further services or cloud access in its current version.
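To make the hit-trigger idea concrete, here is a minimal sketch of a trigger region that fires a key once when a tracked body landmark enters it. It is only an illustration under stated assumptions: the `HitTrigger` class, the rectangular region and the normalised image coordinates (with `y` growing downwards, so "above your head" means small `y` values) are hypothetical and are not taken from MotionInput itself.

```python
from dataclasses import dataclass


@dataclass
class HitTrigger:
    """Hypothetical rectangular trigger region in normalised image coordinates."""

    key: str            # key to emit when the region is entered
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    active: bool = False  # True while a landmark is inside the region

    def contains(self, point):
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

    def update(self, landmarks):
        """Return the key on the frame a landmark enters the region, else None."""
        inside = any(self.contains(p) for p in landmarks)
        fired = inside and not self.active
        self.active = inside
        return self.key if fired else None


# Assumed example: a "jump" trigger placed above the user's head.
jump = HitTrigger(key="space", x_min=0.3, y_min=0.0, x_max=0.7, y_max=0.15)
```

Because the trigger fires only on entry, holding a hand or head inside the region does not repeat the key press; the user has to leave the region and enter it again.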
The Alternate Reality version of the Kinect for Windows
Playing all of your classic and most current games the way you wish to exercise in your gaming, at your own level of movement and comfort.
Using Surface Anywhere technology on current PCs
Surface compatible digital inking, in-air two finger multitouch and federated speech on device with spoken phrases like "click", along with live captioning.
Making software more accessible: with every new release we will continue to listen to users' requests for features, especially in accessibility.
Enabling more users to enjoy new ways to use computers, not to be bound to their desks, and to use movement as a way to bring equitable computing to more users.
QUOTES and feedback
We would love to hear your feedback - good and bad - in helping us to make this software better. This project is developed with academics and students at UCL Computer Science and isn't funded (yet) so any opportunities for collaborations are most welcome! Get in touch here on this form, or below, or by email.
“With technology we can empower everyone and make the impossible possible. Love the potential and possibilities that MotionInput brings to people with disabilities!”
“MotionInput improves quality of life and maintains independence and dignity by making things that matter accessible!”
International Alliance of ALS/MND Assoc.
“Solve a problem for one person completely and you will always win”.
Downloads, Videos and News
To see demos of this in action, check out our YouTube playlist!
The first two applications available on the Microsoft Store are (1) In-Air MultiTouch with Speech and (2) Facial Navigation with Speech (UK/USA/Canada). Search "UCL" on the Microsoft Store.
For Global territories, please see our UCL XIP platform link.
For app instructions and user guides, see the app-specific microsite here: https://www.facenav.org
For legacy builds and historical updates, our CS homepage has prior versions.
News and Press articles
UCL Computer Science News article for Version 3 here.
Intel.com Tech For Good Story article and Video here.
What hardware do I need to run this?
-A Windows 10-based PC or laptop, with a webcam! Ideally an Intel-based PC, 7th Gen and above, and 4GB RAM. An SSD is highly recommended. The more CPU cores the merrier! Running parallel ML and CV is highly compute intensive, even with our extensive optimisations - so for example in Pseudo-VR mode for an FPS game, you may be doing hands recognition for the mouse, walking on the spot recognition, and hit targets with body landmarks, with speech, simultaneously while rendering advanced game visuals. At the simplest end, doing simple mouse clicks in a web browser should be much less intensive.
What platforms does this run on?
-Windows 10 for the full software, and Windows 10 and 11 for the Microbuilds of the stand-alone features. We are also developing the software for Linux, Raspberry Pi, Android and Mac. If you would like to help to test this with us, please email us.
How do I run the software?
-If you are running Windows 10 or 11, you can run the microbuilds of the features without any installation process. Download the zip file, unzip it to your PC, and run the MotionInput executable file. If you want to run the full software with the GUI front-end for settings, you will need Windows 10, and both ViGEmBus (for In-Air Joypad) and the .NET Desktop Runtime 3.1 (x64) installed. Follow the instructions on the download link.
This is very much in the realm of sci-fi! Can I do gestures like in <insert Hollywood film here>?
-Reach out to us with your suggestions, and let's see what we can do!
What motivated you to build this?
-Covid-19 affected the world and, for a while before the vaccines, hospital staff as well as the public were becoming severely ill. Keeping shared computers clean and germ-free comes at a cost to economies around the world. We saw a critical need for cheap or free software to help in healthcare and improve the way we work, so we examined many different methods for touchless computing. Along the journey, several major tech firms made significant jumps in Machine Learning (ML) and Computer Vision, and our UCL IXN programme was well suited to getting them working together with students and academics. Some of the tech firms had also let go of past products that would have been useful if they were still in production, but the learning from them was still there. At the same time, we realised that childhood obesity and general population health were worsening during lockdowns, so we developed several project packages to look specifically at how to get people moving more, with tuning of accuracy for various needs. Especially in healthcare, education and industrial sectors, we looked at specific forms of system inputs and patterns of human movement to develop a robust engine that could scale with future applications. The Touchless Computing team at UCL CS has a key aim of equitable computing for all, without requiring further redevelopment of existing and established software products in use today.
What is Ask-KITA?
-Ask-KITA is our speech engine, much like Alexa and Siri. It is intended for teaching, office work and clinical/industrial specific speech recognition. KITA stands for Know-IT-All. It is a combinatorial speech engine that mixes well with motion gesture technologies and gives three key levels of speech processing: commands (such as turning phrases into keyboard shortcuts and mouse events); localised, offline live caption dictation without user training, delivering recognised words into text-based programs; and gesture-combined exploration of spoken phrases.
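The command level described above, where a recognised phrase becomes a keyboard shortcut or mouse event, can be sketched as a simple lookup table. This is a hedged illustration only, not Ask-KITA's actual API: the function names, the action tuples and the specific key bindings (for example Ctrl+P for "print") are assumptions chosen for the example, and the transcript is assumed to come from some offline speech recogniser.

```python
import re

# Assumed default bindings; the phrases come from the page, the actions are illustrative.
DEFAULT_COMMANDS = {
    "click": ("mouse", "left_click"),
    "double click": ("mouse", "double_click"),
    "right click": ("mouse", "right_click"),
    "print": ("keys", "ctrl+p"),
    "copy": ("keys", "ctrl+c"),
}


def normalise(phrase):
    """Lower-case, strip punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", "", phrase.lower())
    return " ".join(cleaned.split())


def resolve(transcript, commands=DEFAULT_COMMANDS, dictation=False):
    """Map a recognised transcript to an action tuple.

    Known phrases become commands; otherwise the text is passed through
    as dictation when dictation mode is on, or ignored when it is off.
    """
    phrase = normalise(transcript)
    if phrase in commands:
        return commands[phrase]
    if dictation:
        return ("text", transcript)
    return None


# A user-defined phrase overriding a game's keyboard shortcut (hypothetical binding).
custom = {**DEFAULT_COMMANDS, "call up inventory": ("keys", "i")}
```

Keeping the bindings in a plain dictionary makes it straightforward for a user to add their own phrases per application, which is the behaviour the override feature describes.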
What's next?
-We have a lot of great plans, and the tech firms on board are excited to see what our academics and students will come up with next. Keep in touch with us and send us your touchless computing requests; especially if they can help people, we want to know and to open collaborations.
Our UCL CS Touchless Computing Teams (March 2022, March 2023), with thanks to Intel UK.
Contact us to give us feedback, to reach out to us, for software requests and for commercial licensing terms.