In this paper we propose a mathematical learning model for the feeding behaviour of a specialist predator operating in a random environment occupied by two types of prey, palatable mimics and unpalatable models, and of a generalist predator with additional alternative prey at its disposal. A well-known linear reinforcement learning algorithm and its special cases are considered for updating the probabilities of the two actions, eat prey or ignore prey. Each action elicits a probabilistic response from the environment that can be favourable or unfavourable. To assess the performance of the predator, a payoff function is constructed that captures the energetic benefit from consuming acceptable prey, the energetic cost from consuming unacceptable prey, and the lost benefit from ignoring acceptable prey. Conditions under which the predator's payoff improves are also formulated explicitly.
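
As a rough illustration of the kind of update described above, the following sketch implements a two-action linear reward-penalty scheme from the learning automata literature. The text here does not state the exact update rule, so the specific form, the parameter names `a` (reward rate) and `b` (penalty rate), the function names, and the rule deciding which responses count as favourable are all assumptions made for illustration. Setting `b = 0` gives the special case commonly called reward-inaction.

```python
import random


def update_eat_probability(p_eat, action, favourable, a=0.1, b=0.1):
    """One step of an assumed two-action linear reward-penalty update.

    p_eat      -- current probability of the action "eat"
    action     -- the action just taken, "eat" or "ignore"
    favourable -- True if the environment's response was favourable
    a, b       -- hypothetical reward and penalty rates, each in [0, 1]
    """
    # Work with the probability of the action that was actually taken.
    p_taken = p_eat if action == "eat" else 1.0 - p_eat
    if favourable:
        # Reward: move the taken action's probability towards 1.
        p_taken = p_taken + a * (1.0 - p_taken)
    else:
        # Penalty: shrink the taken action's probability towards 0.
        p_taken = (1.0 - b) * p_taken
    # Convert back to the probability of "eat".
    return p_taken if action == "eat" else 1.0 - p_taken


def simulate(steps=1000, p_eat=0.5, p_mimic=0.7, a=0.05, b=0.0, seed=0):
    """Simulate a predator meeting mimics (palatable) with probability p_mimic.

    Assumption for this sketch: eating a mimic or ignoring a model is
    favourable; eating a model or ignoring a mimic is unfavourable.
    """
    rng = random.Random(seed)
    for _ in range(steps):
        action = "eat" if rng.random() < p_eat else "ignore"
        palatable = rng.random() < p_mimic
        favourable = (action == "eat") == palatable
        p_eat = update_eat_probability(p_eat, action, favourable, a, b)
    return p_eat
```

Calling `simulate()` returns the probability of eating after the given number of encounters. The sketch covers only the probability update; the payoff function is not reproduced, since its form is not given in this summary.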