Negative Logits
elta        -0.82
ieved       -0.74
epend       -0.74
vez         -0.73
enfranch    -0.72
pez         -0.70
opter       -0.70
reath       -0.69
rients      -0.67
emis        -0.67
Positive Logits
anyone      0.79
you         0.79
anybody     0.75
somebody    0.69
someone     0.68
ever        0.67
guests      0.67
someday     0.67
they        0.64
objections  0.64
Activations Density 0.059%
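
The page does not say how the density figure is computed. The following is a minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The source does not confirm this definition.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 100,000 token positions, 59 of them active.
sample = [0.0] * 99_941 + [1.0] * 59
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density well below 1% would describe a sparse feature that fires on only a small share of tokens.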