Fig. 1 | BMC Biology

Fig. 1

From: PBertKla: a protein large language model for predicting human lysine lactylation sites

Determination of the optimal sample length and sequence similarity threshold for defining the human Kla benchmark datasets. A Line chart of the AUC value for each dataset, defined by a specific sample length and sequence similarity threshold; the x axis shows sample length, and differently colored lines correspond to different sequence similarity thresholds. B–C Violin plots of the AUC values of datasets grouped by CD-HIT sequence similarity threshold (B) and by sample length (C), with colors distinguishing the groups. D Sequence characteristics of Kla and non-Kla sites in the training data
