INDEX
Explanations
phrases related to denial or rejection
instances of denial or refusal in various contexts
Negative Logits
oiler                     -0.86
ARCH                      -0.74
enegger                   -0.73
Ec                        -0.71
âĸĪâĸĪâĸĪâĸĪâĸĪâĸĪâĸĪâĸĪ  -0.70
INGTON                    -0.68
andals                    -0.67
è¦ļéĨĴ                    -0.67
rim                       -0.66
arnaev                    -0.66
Positive Logits
access           0.83
ļéĨĴ             0.82
afe              0.79
gratification    0.78
entry            0.77
parole           0.74
lement           0.71
refunds          0.69
permission       0.68
ilege            0.67
Activations Density: 0.061%