Negative Logits
collega   -0.09
nee       -0.08
Probe     -0.08
atre      -0.08
quire     -0.08
probe     -0.08
probe     -0.08
Dolly     -0.08
probes    -0.07
zinha     -0.07
Positive Logits
/**/*          0.08
Fear           0.08
Cong           0.08
polov          0.08
hlad           0.08
halves         0.08
نقص            0.07
distrust       0.07
conjunction    0.07
consistently   0.07
Activations Density: 0.008%