INDEX
Explanations
references to the United States and its actions or characteristics
the word "is" used in various contexts
Negative Logits
Telesc        -0.75
rouse         -0.71
idian         -0.66
ossom         -0.62
ees           -0.60
urry          -0.60
iates         -0.60
amount        -0.60
oso           -0.58
Afterwards    -0.57
Positive Logits
indeed        0.95
definitely    0.95
senal         0.91
always        0.91
certainly     0.89
currently     0.87
now           0.87
largely       0.86
generally     0.85
still         0.85
Activations Density 0.662%