TransPHLA-AOMP:
A transformer-based model to predict peptide-HLA class I binding and optimize mutated peptides for vaccine design

The code and data

Dataset: https://github.com/a96123155/TransPHLA-AOMP/tree/master/Dataset
◆ The training set, independent test set, and external test set used by TransPHLA-AOMP
◆ All samples in the above datasets that TransPHLA-AOMP predicted correctly
Code: https://github.com/a96123155/TransPHLA-AOMP

Common HLA sequences

112 common HLAs and their sequences

The attention scores

➤ For each HLA and peptide length: attention scores of the 20 amino acids at each peptide position, for binding or non-binding samples.

➤ Format: *.npy

➤ Description: Each file has three parts: all samples, positive samples (binding data), and negative samples (non-binding data).

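➤ Suggested way to read the files: use the NumPy package in Python, as shown below:
import numpy as np
# Choose one of the three parts stored in each file:
# label = 'all'
# label = 'positive'
# label = 'negative'
# hla and length are the HLA name and peptide length that appear in the file name.
aatype_position = np.load('{}_Length{}.npy'.format(hla, length), allow_pickle=True).item()[label]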

➤ Each file can be used to draw the corresponding heatmap (*.tif) described below; the plotting code is the draw_pHLA_attns function in https://github.com/a96123155/TransPHLA-AOMP/blob/master/TransPHLA-AOMP/attention.py
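
The reading steps above can be wrapped in a small helper that loads all three parts of a file at once. This is a minimal sketch, not the authors' code. The helper name load_attention_file is hypothetical. The file DEMO_Length9.npy is a synthetic stand-in written only so the example runs without the real data. The 20 x 9 array shape is an assumption about how each part is laid out; the real files may store a different object type.

```python
import os
import tempfile

import numpy as np

# The three parts that each attention-score file is described as containing.
PARTS = ("all", "positive", "negative")


def load_attention_file(path):
    """Load one attention-score file and return its three parts as a dict.

    Hypothetical helper: it only wraps the np.load(...).item()[label]
    pattern and checks that the expected labels are present.
    """
    # .item() unwraps the single Python object stored in the loaded array.
    parts = np.load(path, allow_pickle=True).item()
    missing = [name for name in PARTS if name not in parts]
    if missing:
        raise KeyError("missing parts: {}".format(missing))
    return {name: parts[name] for name in PARTS}


# Synthetic stand-in data (assumed layout: 20 amino acids x 9 positions).
rng = np.random.default_rng(0)
n_amino_acids, peptide_length = 20, 9
demo = {name: rng.random((n_amino_acids, peptide_length)) for name in PARTS}

# "DEMO" is a placeholder for the HLA name in the '{}_Length{}.npy' pattern.
demo_path = os.path.join(tempfile.mkdtemp(), "DEMO_Length9.npy")
np.save(demo_path, demo, allow_pickle=True)

loaded = load_attention_file(demo_path)
shapes = {name: np.asarray(part).shape for name, part in loaded.items()}
print(shapes)
```

Because allow_pickle=True makes NumPy unpickle arbitrary Python objects, it is safest to load only files obtained from a trusted source.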

The heatmaps

➤ For each HLA and peptide length: heatmaps of the 20 amino acids at each peptide position, for binding or non-binding samples.

➤ Format: *.tif

➤ DPI: 600