Students will compare different NLP models and the protein sequence vectors they generate, looking for empirical or statistical correlations with performance and stability metrics relevant to biopharmaceutical development. The vectors will then be organized into a structured database.
- Industry: Pharmaceuticals
- Requirements: Open to all students
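
As a rough illustration of the workflow described above, the sketch below stores per-model sequence vectors in a small SQLite table and computes a Pearson correlation between each vector dimension and a single metric. Everything in it is an assumption made for illustration: the table name `sequence_vectors`, the column `stability_metric`, the model label `model_a`, the choice of SQLite and Pearson correlation, and the toy numbers, which are placeholders and not real data. The project itself does not specify any of these.

```python
import json
import math
import sqlite3


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def create_store(conn):
    # Hypothetical schema: one row per (model, sequence) pair.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sequence_vectors (
               model TEXT NOT NULL,
               sequence_id TEXT NOT NULL,
               vector TEXT NOT NULL,          -- JSON-encoded list of floats
               stability_metric REAL,         -- placeholder metric column
               PRIMARY KEY (model, sequence_id)
           )"""
    )


def add_vector(conn, model, sequence_id, vector, stability_metric):
    conn.execute(
        "INSERT OR REPLACE INTO sequence_vectors VALUES (?, ?, ?, ?)",
        (model, sequence_id, json.dumps(vector), stability_metric),
    )


def correlations_by_dimension(conn, model):
    """Correlate each vector dimension with the metric for one model."""
    rows = conn.execute(
        "SELECT vector, stability_metric FROM sequence_vectors "
        "WHERE model = ? AND stability_metric IS NOT NULL "
        "ORDER BY sequence_id",
        (model,),
    ).fetchall()
    vectors = [json.loads(v) for v, _ in rows]
    metrics = [m for _, m in rows]
    # zip(*vectors) regroups the values by dimension across sequences.
    return [pearson(list(dim), metrics) for dim in zip(*vectors)]


conn = sqlite3.connect(":memory:")
create_store(conn)

# Toy placeholder values, not real embeddings or measurements.
toy_rows = [
    ("seq1", [0.0, 1.0], 1.0),
    ("seq2", [1.0, 0.5], 2.0),
    ("seq3", [2.0, 1.0], 3.0),
]
for sequence_id, vector, metric in toy_rows:
    add_vector(conn, "model_a", sequence_id, vector, metric)

result = correlations_by_dimension(conn, "model_a")
print(result)
```

Running the same query for each model label would give one list of per-dimension correlations per model, which is one possible basis for the comparison. In practice a rank-based statistic or a regression over the full vector may be more appropriate than per-dimension Pearson correlation.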