INDEX
Explanations
instances where someone is expressing support or enthusiasm for a particular idea or action
expressions of personal support or commitment to causes
Negative Logits
Token         Logit
themselves    -0.77
allegedly     -0.67
NZ            -0.66
respectively  -0.66
Autob         -0.65
their         -0.65
ranged        -0.63
Their         -0.62
ERE           -0.58
Hebdo         -0.58
Positive Logits
Token         Logit
myself        1.58
my            0.96
poke          0.74
aido          0.71
uno           0.69
Patreon       0.69
76561         0.66
minist        0.64
ividual       0.63
paraph        0.63
Activations Density: 0.817%