NEGATIVE LOGITS
Token       Logit
gable       0.43
waste       0.43
simply      0.39
HLA         0.38
explore     0.37
nable       0.37
prüfung     0.37
་་          0.37
explore     0.36
leading     0.36
POSITIVE LOGITS
Token       Logit
certain     0.58
Iraqi       0.55
Iraq        0.55
某些          0.53
Sudan       0.51
например    0.49
Saddam      0.49
notorious   0.48
tomatoes    0.48
Kashmir     0.47
Activations Density 0.060%