Explanations
phrases related to actions or decisions made by a group of people
references to groups of people or entities in a plural form
Negative Logits
Token        Logit
open         -0.67
arm          -0.66
golf         -0.65
ice          -0.64
high         -0.62
convention   -0.62
level        -0.62
cap          -0.61
tail         -0.61
bow          -0.60
Positive Logits
Token        Logit
they         3.25
their        2.24
them         1.98
these        1.82
there        1.76
you          1.66
those        1.65
she          1.49
we           1.45
They         1.45
Activations Density: 0.007%