Serious games for mental health are seen as the groundwork for assistive technology that maintains and improves mental health. We present a technical system layout, which we have partly implemented for demonstration purposes, and highlight its vision-based perception and manipulation capabilities. These include physical interactions employing artificial general intelligence in virtual reality applications. We employ hand gesture tracking as well as a gaze and eye tracking system integrated into the Oculus Rift. The resulting serious games should eventually cover activities of daily life, which we additionally monitor. Dynamic and contextual modelling of obstacles is a central issue, and the capabilities required for serious games include knowledge about the 3D world. Such knowledge includes interpretations of gaze and hand sensor data for extracting multimedia information in causal relationships. Towards this goal, we envision using virtual reality with a physics engine (rigid and soft body dynamics, including collision detection) for the observed objects. We also exploit semantic networks to enable the machine to filter information and infer ongoing complex events, including hidden BDI (beliefs, desires, intentions) variables. We see this combination of technologies as the relevant groundwork for reaching human-level general intelligence and for enabling real-world applications. Future applications target user groups such as dementia patients.