2021.09.30 NOTE: We are in the process of recomputing the following results and will update the page shortly.

The following are the baseline methods we evaluated:

  • Levenshtein: Levenshtein distance to wild-type.

  • BLOSUM62: BLOSUM62-score relative to wild-type.

  • Ridge regression: Ridge regression model on one-hot encoding.

  • Convolutional network: Simple convolutional network on one-hot encoding.

  • ESM-untrained: 650M parameter transformer with randomly initialized weights.

  • ESM-1b: 650M parameter transformer pretrained on UniRef50.

  • ESM-1v: 650M parameter transformer pretrained on UniRef90. Only one member of the ensemble is used due to compute constraints.
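
As a rough illustration of the two simplest ingredients above, the following sketch computes a Levenshtein distance to wild-type and builds the flattened one-hot encoding that a ridge regression or convolutional model could take as input. It is not the evaluation code behind the results on this page. The sequences, the residue ordering in `AMINO_ACIDS`, and the function names are assumptions made for the example.

```python
from typing import List

# Assumption: the 20 canonical residues in an arbitrary but fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def one_hot(seq: str) -> List[float]:
    """Flattened one-hot encoding with len(seq) * 20 entries."""
    n = len(AMINO_ACIDS)
    vec = [0.0] * (len(seq) * n)
    for pos, aa in enumerate(seq):
        vec[pos * n + AMINO_ACIDS.index(aa)] = 1.0
    return vec


# Hypothetical wild-type and single-substitution variant, for illustration only.
wild_type = "MKTAYIAK"
variant = "MKTVYIAK"

distance = levenshtein(variant, wild_type)
features = one_hot(variant)
print(distance, len(features))
```

The distance can serve directly as a zero-parameter predictor, while the one-hot vectors would be stacked into a feature matrix for a supervised model. A BLOSUM62 score would additionally need the substitution matrix, which is omitted here.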


Supplemental baselines


