INDEX
Explanations
expressions indicating personal opinions or self-descriptions
Negative Logits
Token               Logit
contextLoads        -0.91
<?                  -0.79
UpInside            -0.77
Offisielt           -0.76
berdayakan          -0.75
featureID           -0.71
NUMX                -0.70
(:,:,               -0.69
GIVEREF             -0.69
存于互联网档案馆    -0.66
Positive Logits
Token               Logit
Labor               0.46
soal                0.45
Flo                 0.45
Labor               0.42
content             0.42
labor               0.41
jaan                0.40
cal                 0.40
tem                 0.40
ын                  0.40
Activations Density: 0.001%