INDEX
Negative Logits
ä            1.09
än           0.83
modellen     0.80
It           0.79
ain          0.78
terug        0.78
kiezen       0.77
verwend      0.74
Protestants  0.73
There        0.72
Positive Logits
reputation   1.15
reputations  1.13
리           0.91
도           0.91
اد           0.87
Reputation   0.87
에           0.87
스           0.87
ahari        0.82
फोल          0.79
Activations Density 0.028%