Negative Logits
ordinarily    0.45
invis         0.43
Citizens      0.39
immensely     0.39
otrex         0.38
,             0.38
getDefault    0.38
人类          0.38
Citizens      0.37
ствен         0.36
Positive Logits
f             0.52
ג             0.52
serie         0.49
evening       0.49
ంత్ర          0.48
หาร           0.46
history       0.45
geg           0.45
따른          0.45
analisi       0.45
Activations Density 0.004%