Ferruz et al.

ProtGPT2: Generation

Protein Design
Step 1: Upload your data

(Optional) Upload Amino Acid Starting Sequences

  • Your file can be in the following format: txt
  • If a file is not provided, the model will generate new proteins from scratch. Please provide your protein starting sequences in a text file, with each sequence on a new line. The model will use these to generate the complete proteins.
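The one-sequence-per-line input format described above can be checked before upload. The following is a minimal sketch, not the app's own validation: the function name `parse_starting_sequences` is hypothetical, and it assumes that only the 20 standard one-letter amino acid codes are accepted, which the page does not state.

```python
from typing import List

# Assumption: only the 20 standard one-letter amino acid codes are valid.
# The app itself may accept a wider alphabet.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_starting_sequences(text: str) -> List[str]:
    """Parse a text file body with one starting sequence per line.

    Blank lines are skipped and sequences are upper-cased. A ValueError
    is raised for any character outside the assumed alphabet.
    """
    sequences = []
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        sequence = raw_line.strip().upper()
        if not sequence:
            continue  # ignore empty lines
        invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
        if invalid:
            raise ValueError(
                f"line {line_number}: unexpected characters {invalid}"
            )
        sequences.append(sequence)
    return sequences


if __name__ == "__main__":
    example = "MKTAYIAKQR\n\nmgsshhhhhh\n"
    print(parse_starting_sequences(example))
```

Running the check locally catches stray headers or FASTA-style `>` lines, which this sketch would reject as invalid characters.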
Step 2: Set Parameters
Step 3: Complete run profile

A language model trained on the protein space that generates de novo protein sequences following the principles of natural proteins. The generated proteins display natural amino acid propensities, while disorder predictions indicate that 88% of ProtGPT2-generated proteins are globular, in line with natural sequences. ProtGPT2 is also capable of protein sequence completion.

This app can generate protein sequences of a specified length while preserving critical functional domains. This makes it well suited to designing smaller or more compact versions of proteins, for gene therapy and other purposes.
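Outside the app, length-bounded sampling of this kind can be sketched with the Hugging Face `transformers` text-generation pipeline and the public `nferruz/ProtGPT2` checkpoint. This is an illustration under assumptions, not the app's implementation: the helper names `build_generation_kwargs` and `generate` are hypothetical, the default values are placeholders, and how the app maps its own parameters onto these arguments is not stated on this page.

```python
from typing import Any, Dict, List

MODEL_ID = "nferruz/ProtGPT2"  # public checkpoint on the Hugging Face Hub


def build_generation_kwargs(
    max_length: int = 100,
    num_sequences: int = 10,
    top_k: int = 950,
    repetition_penalty: float = 1.2,
) -> Dict[str, Any]:
    """Collect sampling arguments for the text-generation pipeline.

    Defaults are illustrative assumptions. Note that max_length counts
    tokens, not amino acid residues.
    """
    if max_length < 1 or num_sequences < 1 or top_k < 1:
        raise ValueError("max_length, num_sequences and top_k must be >= 1")
    return {
        "max_length": max_length,
        "do_sample": True,  # sample rather than decode greedily
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "num_return_sequences": num_sequences,
    }


def generate(prompt: str = "<|endoftext|>", **overrides: Any) -> List[str]:
    """Generate sequences from a prompt (downloads the model on first use).

    The default prompt starts a protein from scratch; passing a starting
    sequence instead asks the model to complete it.
    """
    from transformers import pipeline  # imported lazily; heavy dependency

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, **build_generation_kwargs(**overrides))
    return [output["generated_text"] for output in outputs]
```

For example, `generate("MKTAYIAKQR", max_length=50, num_sequences=3)` would request three completions of the given starting sequence, each capped at 50 tokens.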

Example use case: Designing a novel protein with specific functional properties.

Technology: GPT-2

Limitations: The model can generate sequences that are too similar to existing proteins, or proteins that are less stable than natural proteins.

Citation:
Ferruz, N., Schmidt, S. & Höcker, B. ProtGPT2 is a deep unsupervised language model for protein design. Nat Commun 13, 4348 (2022). https://doi.org/10.1038/s41467-022-32007-7
Released:
Nov-01-2023