INDEX
NEGATIVE LOGITS
논  0.54
논  0.52
logically  0.50
들  0.47
pot  0.44
언  0.44
nj  0.43
論  0.41
Pol  0.40
experiments  0.40
POSITIVE LOGITS
قائ  0.43
েইল  0.40
압  0.39
వేశ  0.39
hairy  0.39
außerdem  0.39
yeah  0.38
жаем  0.38
wah  0.38
虏  0.38
Activations Density 0.002%