INDEX
Explanations
phrases indicating concern or critique
phrases indicative of ethical concerns and debates
Negative Logits
Flo        -0.62
unlucky    -0.62
\<         -0.62
ById       -0.59
devices    -0.59
nin        -0.58
ILCS       -0.57
nown       -0.57
holes      -0.57
hops       -0.57
Positive Logits
overturn       0.67
regulating     0.64
democracy      0.62
osate          0.58
governance     0.58
determining    0.58
preserving     0.57
protecting     0.57
upholding      0.56
procedural     0.56
Activations Density: 1.598%