INDEX
NEGATIVE LOGITS
joint           -0.08
take            -0.08
follic          -0.08
psych           -0.08
orang           -0.08
investigative   -0.08
staand          -0.08
outlets         -0.07
verm            -0.07
记              -0.07
POSITIVE LOGITS
Better          0.09
Poor            0.08
Laut            0.07
Find            0.07
Mauro           0.07
ery             0.07
Carol           0.07
Laz             0.07
Called          0.07
Mare            0.07
Activations Density 0.003%