Explanations
statements advocating for political or social change
Negative Logits
ocracy    -0.68
ILCS      -0.64
ioxide    -0.62
Camel     -0.61
lua       -0.60
#$        -0.60
Sheen     -0.60
RO        -0.59
alion     -0.58
RF        -0.58
Positive Logits
etheless      0.92
editions      0.91
generations   0.89
incarn        0.88
eras          0.87
itiz          0.83
versions      0.81
iterations    0.78
ebin          0.78
cies          0.76
Activations Density: 0.379%