Explanations
phrases indicating a speaker's role or identity.
statements related to societal issues, activism, and calls for unity and action.
Negative Logits
»Ĵ         -0.73
OND        -0.65
ONSORED    -0.63
Stars      -0.63
TERN       -0.62
eruption   -0.60
stage      -0.59
usterity   -0.59
undown     -0.59
usters     -0.58
Positive Logits
myself     0.75
huh        0.71
however    0.70
we         0.68
please     0.66
naturally  0.64
she        0.64
moreover   0.64
sometimes  0.63
beware     0.63
Activations Density: 0.112%