INDEX
Explanations
descriptions of individuals using specific terms to define themselves
phrases describing self-identification or self-description labels
Negative Logits
isons       -0.79
ubi         -0.73
anqu        -0.71
adata       -0.71
azaki       -0.70
anches      -0.67
akings      -0.66
itations    -0.64
vibrations  -0.63
elight      -0.62
Positive Logits
sty         0.90
member      0.81
believer    0.80
adherent    0.80
newcomer    0.77
legislator  0.76
former      0.75
resident    0.73
founder     0.71
gunman      0.71
Activations Density: 0.082%