INDEX
Explanations
phrases related to societal issues and demographics
phrases that indicate analysis or discussion of social groups and their characteristics
Negative Logits
hene       -0.79
ankind     -0.69
herent     -0.67
atre       -0.67
sburgh     -0.66
vernment   -0.65
hash       -0.64
raft       -0.64
tern       -0.63
aste       -0.63
Positive Logits
namely       1.22
albeit       1.15
whereby      1.13
although     1.10
which        1.03
including    1.03
notably      1.03
lest         1.00
wherein      0.99
suggesting   0.99
Activations Density 0.501%