INDEX
Explanations
verbs or phrases related to subjecting or exposing someone to something, often with negative connotations
phrases indicating subjection or external influence on individuals
Negative Logits
Token        Logit
case         -0.73
bra          -0.72
soon         -0.66
fox          -0.66
ppa          -0.65
yssey        -0.65
iola         -0.63
cube         -0.63
headlined    -0.63
framework    -0.62
Positive Logits
Token            Logit
certain          0.90
whichever        0.79
these            0.77
the              0.77
various          0.76
whatever         0.74
a                0.73
an               0.73
varying          0.72
stereotypical    0.72
Activations Density: 0.231%