INDEX
Explanations
words related to urgent matters or calls to action
Negative Logits
Token       Logit
endas       -0.80
rex         -0.80
seless      -0.69
expensive   -0.66
rea         -0.64
itals       -0.62
romy        -0.62
pees        -0.61
rez         -0.60
ribes       -0.60
Positive Logits
Token       Logit
THING       1.39
WHERE       1.17
body        1.13
where       1.09
conceivable 0.96
ONE         0.95
semblance   0.91
kind        0.89
thin        0.89
how         0.87
Activations Density: 0.344%