To date, the relative contribution of the different levels of the visual hierarchy to perceptual decisions remains unclear. Typical models of visual processing, with the reverse hierarchy theory (RHT) as a prominent example, strongly emphasize the role of higher levels and interpret lower levels as a sequence of simple feature detectors. Here, we investigate this issue in two analyses. Using a novel combination of perceptual learning based on two classes of parametric faces and a subsequent odd-one-out paradigm, we first test a vital prediction of RHT: high-level pop-out. With this experimental approach, we overcome the low-level confounds of previous studies while still introducing distinct high-level representations. Contrary to previous findings, our analyses show that there is no high-level pop-out, despite very early, near-perfect classification accuracy and extensive training of our subjects. Second, we explore the underlying form of category representation during subsequent stages of perceptual training. This is accomplished by including class-external and class-internal target-distractor combinations. Whereas the subjects' responses during the first sessions are best explained by an instance-based representation that depends on low-level metric differences, later patterns exhibit the inclusion of high-level, class-based information that is independent of target-stimulus similarity. Finally, we show that the utilized level of information is highly task-dependent.