Additive SMILES-based optimal descriptors were used to model bee toxicity. The influence of the relative prevalence of SMILES attributes in the training and test sets on the resulting models was analysed. Excluding rare attributes improves the statistical characteristics of the model on the external test set. The possibility of using the probability of the presence of SMILES attributes in the training and test sets for a rational definition of the applicability domain is discussed.
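As a rough illustration of the idea, the following minimal sketch computes an additive descriptor as a sum of per-attribute weights and skips attributes that are rare in the training set. Everything in it is an assumption made for illustration and is not taken from the paper: the tokenizer, the prevalence threshold of 0.5, the weight values, the example SMILES strings, and the domain rule in `in_domain` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical tokenizer: Cl, Br, bracket atoms, and any other single
# character are each treated as one SMILES attribute.
_TOKEN = re.compile(r"Cl|Br|\[[^\]]*\]|.")


def attributes(smiles):
    """Split a SMILES string into a list of attributes."""
    return _TOKEN.findall(smiles)


def prevalence(smiles_list):
    """Fraction of structures in which each attribute occurs at least once."""
    counts = Counter()
    for s in smiles_list:
        counts.update(set(attributes(s)))
    n = len(smiles_list)
    return {a: c / n for a, c in counts.items()}


def descriptor(smiles, weights, allowed):
    """Additive descriptor: sum of weights over attributes that are not rare."""
    return sum(weights.get(a, 0.0) for a in attributes(smiles) if a in allowed)


def in_domain(smiles, allowed):
    """Assumed domain rule: every attribute must be non-rare in training."""
    return all(a in allowed for a in attributes(smiles))


# Made-up example data, not from the paper.
train = ["CCO", "CCCl", "c1ccccc1O", "CCN"]
test = ["CCBr", "CCOC"]

p_train = prevalence(train)
p_test = prevalence(test)

threshold = 0.5  # illustrative cut-off for "rare"
allowed = {a for a, p in p_train.items() if p >= threshold}

weights = {"C": 0.5, "O": 1.0, "Cl": 2.0}  # made-up correlation weights

for s in test:
    print(s, descriptor(s, weights, allowed), in_domain(s, allowed))
```

In this toy data set only `C` and `O` reach the threshold, so `Cl` contributes nothing to the descriptor even though it has a weight, and a test structure containing `Br` falls outside the assumed domain. Comparing `p_train` with `p_test` attribute by attribute is one simple way to see where the two sets differ.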