Explanations
phrases related to social issues and criticisms
references to collective ownership or shared experiences
Negative Logits
wrap      -0.76
wic       -0.74
puff      -0.74
bender    -0.74
ouse      -0.72
yang      -0.72
Levine    -0.72
AE        -0.70
icted     -0.68
zzle      -0.68
Positive Logits
selves       1.30
ancestors    1.19
nation       1.18
own          1.17
shores       1.07
selves       1.05
beloved      1.05
collective   1.04
country      1.02
society      0.97
Activations Density 0.120%
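
The page does not define the density figure above. A minimal sketch, assuming it means the fraction of sampled token positions on which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires;
    the source page does not spell out its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 12 of them with a nonzero activation.
sample = [0.0] * 9988 + [0.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.120% would mean the feature fires on roughly 12 of every 10,000 tokens, which fits a sparse feature tied to a narrow theme.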