Explanations
instances of actions performed without consent
phrases indicating actions taken without permission or consent
Negative Logits
lyak              -0.74
=-=-=-=-=-=-=-=-  -0.69
rez               -0.67
の魔              -0.67
Reincarnated      -0.67
atile             -0.66
dominated         -0.65
raq               -0.65
hov               -0.65
esa               -0.65
Positive Logits
authorization  1.25
permission     1.18
informing      1.10
consent        1.09
specifying     1.04
explanation    1.02
reperc         1.01
disclosing     1.01
warning        1.00
supervision    0.99
Activations Density 0.082%