NEGATIVE LOGITS
joint      -0.09
Pek        -0.09
ener       -0.09
artisan    -0.09
Caller     -0.09
erty       -0.08
aison      -0.08
Neville    -0.08
Cousins    -0.08
uct        -0.08
POSITIVE LOGITS
person     0.21
citizen    0.18
citizens   0.15
listener   0.14
human      0.14
Person     0.13
listeners  0.13
friend     0.13
cit        0.12
daughter   0.12
Activations Density: 0.079%