INDEX
Explanations
words related to criminal activities and legal actions
occurrences of the word "allegedly."
Negative Logits
ERAL      -0.76
lite      -0.71
bern      -0.68
anim      -0.68
ilation   -0.66
egg       -0.66
Ale       -0.65
fork      -0.63
helm      -0.63
ament     -0.62
Positive Logits
violated     0.77
implicated   0.76
plotted      0.76
involved     0.74
infringing   0.74
infring      0.73
accuse       0.71
originated   0.71
obtained     0.71
interfered   0.70
Activations Density: 0.013%