We propose a deep-learning approach to producing computer-generated holograms (CGHs) of real-world scenes. We design an end-to-end convolutional neural network, the Stereo-to-Hologram Network (SHNet), that takes a stereo image pair as input and efficiently synthesizes a monochromatic 3D complex hologram as output. The network computes CGHs directly from recorded images of real-world scenes, eliminating the need for time-consuming intermediate depth recovery and diffraction-based computation. We demonstrate 3D reconstructions with clear depth cues from the SHNet-based CGHs in both numerical simulations and optical holographic virtual reality display experiments.
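
To make the numerical-simulation step concrete, the following is a minimal sketch of how a complex hologram might be refocused at two depths. It assumes the angular spectrum method, which is a common choice for numerical reconstruction but is not stated here as the authors' method. The two-channel (real, imaginary) output layout, the random stand-in for the network output, and the wavelength, pixel pitch, and distances are all hypothetical values chosen for illustration.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pitch, distance):
    """Propagate a complex 2D field by `distance` (metres) with the angular spectrum method."""
    ny, nx = field.shape
    # Spatial frequencies (cycles per metre) matching the FFT sample layout
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    # Squared longitudinal frequency; non-positive values are evanescent waves
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    # Free-space transfer function, with evanescent components suppressed
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Hypothetical stand-in for a network output: two real channels
# interpreted as the real and imaginary parts of the hologram (an assumption).
rng = np.random.default_rng(0)
net_out = rng.standard_normal((2, 64, 64))
hologram = net_out[0] + 1j * net_out[1]

wavelength = 532e-9  # assumed green laser wavelength, metres
pitch = 8e-6         # assumed pixel pitch, metres

# Intensity images refocused at two assumed depths
intensity_near = np.abs(angular_spectrum_propagate(hologram, wavelength, pitch, 0.10)) ** 2
intensity_far = np.abs(angular_spectrum_propagate(hologram, wavelength, pitch, 0.20)) ** 2
```

With a hologram of a real scene in place of the random array, objects at different depths would be expected to come into focus at different propagation distances, which is the kind of depth cue a numerical reconstruction is meant to reveal.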