INDEX
Explanations
terms associated with types of social dynamics and political labels
Negative Logits
ths         -0.75
�           -0.69
saf         -0.69
utions      -0.67
openings    -0.66
hens        -0.66
wise        -0.60
ashes       -0.59
ther        -0.59
marriage    -0.59
Positive Logits
besides       0.84
perched       0.72
emis          0.72
along         0.71
clicked       0.69
showcasing    0.69
below         0.69
believed      0.69
herein        0.68
nailed        0.68
Activations Density: 1.277%