INDEX
Explanations
phrases related to governmental laws and regulations
the frequent use of the word "are" in various contexts
Negative Logits
iates     -0.69
onso      -0.68
dom       -0.67
uld       -0.67
urry      -0.67
town      -0.64
icism     -0.64
osate     -0.63
ossom     -0.62
ileaks    -0.61
Positive Logits
senal      1.20
wolves     1.02
hereby     0.93
supposed   0.91
usually    0.88
nt         0.88
wolf       0.87
not        0.87
currently  0.86
generally  0.84
Activations Density: 0.355%