Explanations
mentions of laws or regulations being applied or discussed
the verb "are" in various contexts
Negative Logits
urry       -0.74
onso       -0.71
oire       -0.71
iates      -0.70
ossom      -0.67
ileaks     -0.67
icism      -0.64
ortmund    -0.63
ulus       -0.62
town       -0.62
Positive Logits
senal      1.17
wolves     1.04
hereby     0.93
wolf       0.93
supposed   0.91
not        0.86
usually    0.84
currently  0.83
generally  0.82
bound      0.81
Activations Density: 0.342%