The code and data
Dataset: https://github.com/a96123155/TransPHLA-AOMP/tree/master/Dataset
◆ The training set, independent test set, and external test set used by TransPHLA-AOMP
◆ All samples correctly predicted by TransPHLA-AOMP in the above datasets
Code: https://github.com/a96123155/TransPHLA-AOMP
Common HLA sequence
112 common HLAs and their sequences
The attention scores
➤ Content: for each HLA and each peptide length, the attention scores of the 20 amino acids at each peptide position, for binding and non-binding samples.
➤ Format: *.npy
➤ Description: Each file has three parts: all samples, positive samples (binding data), and negative samples (non-binding data).
➤ Suggested reading method: use the NumPy package in Python, as shown below:
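import numpy as np
# Choose one of the three parts stored in each file:
# label = 'all'
# label = 'positive'
# label = 'negative'
# hla and length must match the HLA name and peptide length in the file name.
aatype_position = np.load('{}_Length{}.npy'.format(hla, length), allow_pickle=True).item()[label]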
➤ Each file can be used to draw the corresponding heatmap (*.tif); the plotting code is the draw_pHLA_attns function in https://github.com/a96123155/TransPHLA-AOMP/blob/master/TransPHLA-AOMP/attention.py
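The reading method above can be wrapped in a small helper. The following is a minimal sketch, not part of the TransPHLA-AOMP code: the function names load_attention_part and write_toy_file, the HLA name "HLA-EXAMPLE", and the 20 x length array stored under each key are assumptions made only so the example runs without the real files. The actual type of the stored values may differ.

```python
import os
import tempfile

import numpy as np

VALID_LABELS = ("all", "positive", "negative")


def load_attention_part(directory, hla, length, label="all"):
    """Load one part of a file named '<hla>_Length<length>.npy' (hypothetical helper)."""
    if label not in VALID_LABELS:
        raise ValueError("label must be one of {}".format(VALID_LABELS))
    path = os.path.join(directory, "{}_Length{}.npy".format(hla, length))
    # The file holds a pickled dict, so allow_pickle=True is required;
    # .item() unwraps the 0-d object array back into that dict.
    parts = np.load(path, allow_pickle=True).item()
    return parts[label]


def write_toy_file(directory, hla, length):
    """Write a stand-in file with made-up contents (assumption: 20 x length arrays)."""
    rng = np.random.default_rng(0)
    parts = {label: rng.random((20, length)) for label in VALID_LABELS}
    path = os.path.join(directory, "{}_Length{}.npy".format(hla, length))
    np.save(path, parts, allow_pickle=True)
    return parts


with tempfile.TemporaryDirectory() as tmp:
    # "HLA-EXAMPLE" is a placeholder, not a real file name from the dataset.
    expected = write_toy_file(tmp, "HLA-EXAMPLE", 9)
    positive = load_attention_part(tmp, "HLA-EXAMPLE", 9, label="positive")
    print(positive.shape)
```

To read a real file, point directory at the folder holding the downloaded *.npy files and pass the HLA name and peptide length exactly as they appear in the file name. Because allow_pickle=True can execute code embedded in a file, it is best used only on files obtained from a trusted source.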
The heatmaps
➤ Content: for each HLA and each peptide length, a heatmap of the 20 amino acids at each peptide position, for binding and non-binding samples.
➤ Format: *.tif
➤ DPI: 600