Explanations
words related to accusations and legal guilt
instances of the word "guilty" in various contexts
Negative Logits
andel    -0.79
edia     -0.76
Gork     -0.72
pora     -0.70
flies    -0.70
ILA      -0.69
pid      -0.67
lav      -0.66
grad     -0.65
yip      -0.65
Positive Logits
plea              0.82
verdict           0.80
isance            0.79
unfocusedRange    0.79
alty              0.79
pleas             0.78
guilty            0.77
iciary            0.77
Guilty            0.76
thouse            0.75
Activations Density 0.008%