INDEX
Explanations
phrases expressing skepticism or criticism regarding societal structures and justice systems
Negative Logits
indrome     -0.06
avenport    -0.06
tomorrow    -0.06
imitives    -0.05
compare     -0.05
hopefully   -0.05
emoc        -0.05
Wonder      -0.05
isco        -0.05
should      -0.05
Positive Logits
actual      0.19
Actual      0.17
Actual      0.17
实际        0.17
actual      0.16
actually    0.15
reality     0.15
_actual     0.14
actually    0.14
practical   0.13
Activations Density 0.042%