INDEX
Explanations
instances of people pleading guilty in legal contexts
Negative Logits
ijke            -0.17
_OM             -0.15
urette          -0.15
/preferences    -0.14
esson           -0.14
annonce         -0.14
abase           -0.14
iverse          -0.14
alian           -0.14
ola             -0.14
Positive Logits
guilty          0.43
gu              0.23
Gu              0.23
guilt           0.21
Ple             0.20
plea            0.20
Gu              0.20
plead           0.19
innocent        0.19
pleaded         0.19
Activations Density 0.012%