INDEX
Explanations
references to legal charges and offenses
Negative Logits
impunity     -0.17
nze          -0.15
šov          -0.14
nore         -0.13
ouce         -0.13
نامه         -0.13
kas          -0.13
apologies    -0.13
attern       -0.12
šet          -0.12
Positive Logits
charges      0.91
charge       0.79
Charges      0.72
charges      0.68
Charge       0.66
charge       0.63
charged      0.63
Charge       0.59
charg        0.52
charging     0.51
Activations Density 0.088%