Owing to advances in computer technology, human-computer interaction has evolved from command-line interfaces to natural user interfaces that enable interaction in a more human-like way, such as through speech, hand and body gestures, facial expressions, and eye gaze. In this study, control of three-dimensional images with gestures and speech is realized using a three-dimensional depth camera, in order to make human-computer interaction more natural. To this end, the realized system allows the user to start and close the application and to interact with the three-dimensional images using only speech and gestures, without a keyboard or mouse. The system supports three speech commands: one to start the application, one to close it, and one to reset the three-dimensional image. In addition, gesture-based commands are used to rotate, pan, and zoom the three-dimensional image.