Explanations
proper nouns related to news or media headlines
instances of the word "NEW" indicating a possible focus on new information or updates
Negative Logits
stood       -0.82
ppe         -0.81
76561       -0.68
cephal      -0.65
mop         -0.64
agn         -0.63
uca         -0.63
minecraft   -0.63
McGee       -0.62
osate       -0.62
Positive Logits
YORK        1.39
foundland   1.13
PORT        1.10
ARK         0.96
CAST        0.93
Orleans     0.89
ITY         0.88
bie         0.88
TON         0.86
MAN         0.84
Activations Density 0.008%