INDEX
Explanations
phrases indicating inclusivity or collective belonging
phrases that emphasize collective experience or unity
Negative Logits
Bots        -0.63
kers        -0.60
nect        -0.59
Nerd        -0.58
cdn         -0.57
ULTS        -0.56
Featured    -0.53
Plex        -0.53
cers        -0.52
Minority    -0.52
Positive Logits
agher       0.70
else        0.68
upon        0.63
except      0.63
ocating     0.62
iance       0.62
-           0.61
ieu         0.60
ocation     0.59
--          0.59
Activations Density 0.124%