NEGATIVE LOGITS
entrap         0.42
assel          0.40
Giveen         0.36
Casey          0.36
nutr           0.36
assault        0.35
trapping       0.35
setDefault     0.35
improvisation  0.35
participate    0.35
POSITIVE LOGITS
том            0.43
POM            0.42
vär            0.42
POT            0.41
ติก            0.40
ürn            0.40
Vor            0.40
을             0.40
TRE            0.40
bure           0.40
Activations Density 0.001%