The availability of genomic DNA and cDNA sequence data has escalated the data mining and genomics era. We aim to investigate the length distributions of the non-coding and coding regions of protein genes of two model organisms, Arabidopsis thaliana and Drosophila melanogaster. A non-linear functional relationship model was applied and strong correlation was found between the Coding Sequence (CDS) and non-coding sequence regions, conditional on the 5' UTR data. Significant differences were found between the protein functional classes and each gene region. Examination of the non-coding and coding regions of these organisms has revealed possible correlations.