The large-scale manufacture of biological products generates significant quantities of process information that could inform future design decisions. Currently, this information is not exploited to its full potential. The challenge is thus to identify or develop tools that allow this valuable resource to be used. The main objective of the research reported in this paper was to investigate whether information from previous processes, in particular that extracted from protein sequence data, can be used to inform process route selection early in development. The approach adopted draws on tools from data mining and pattern recognition, including the Fisher correlation score and self-organising maps. The methodology developed was applied to two case studies using the amino acid sequences of 41 proteins previously developed at Avecia Biologics, together with associated information on the downstream processing steps used during their large-scale manufacture. The results demonstrate that information from previous processes can be used to inform process route selection.