Negative Logits
orso          0.43
CS            0.42
CS            0.40
Sensor        0.40
assignment    0.39
assignment    0.39
CSF           0.39
センサー      0.39
shocking      0.38
coleg         0.38
Positive Logits
Affairs       0.42
fnt           0.42
구해          0.40
trav          0.37
autof         0.37
“.            0.36
Всім          0.36
給大家        0.35
ванне         0.35
пе            0.35
Activations Density 0.000%