Explanations
action words related to actions done without permission or consent
actions that occur without consent or permission
Negative Logits
raq                 -0.81
late                -0.78
geist               -0.78
cow                 -0.73
iers                -0.72
ahime               -0.70
=-=-=-=-=-=-=-=-    -0.69
atile               -0.67
mun                 -0.67
lish                -0.66
Positive Logits
regard              1.02
necessarily         0.97
sacrificing         0.97
compromising        0.94
mentioning          0.93
knowing             0.92
hesitation          0.88
disclosing          0.87
exception           0.86
risking             0.85
Activations Density 0.042%