INDEX
NEGATIVE LOGITS
steroids    -0.08
hair        -0.08
.angle      -0.08
Ey          -0.07
Assets      -0.07
ాగే         -0.07
आ           -0.07
poda        -0.07
gastric     -0.07
yog         -0.07
POSITIVE LOGITS
attention   0.09
Reform      0.08
abuses      0.08
welfare     0.08
interne     0.08
ixin        0.08
timeval     0.08
Lucy        0.08
Welfare     0.08
obedience   0.08
Activations Density: 0.003%