INDEX
Explanations
mentions of specific locations or landmarks
references to specific locations or landmarks
Negative Logits
の魔        -0.75
ijah        -0.72
erous       -0.66
Jericho     -0.65
crim        -0.65
CLASSIFIED  -0.64
aunders     -0.63
20439       -0.63
erial       -0.63
Dickens     -0.63
Positive Logits
lich        0.98
idates      0.75
llan        0.75
hei         0.75
pport       0.75
mite        0.72
inia        0.71
erity       0.71
reements    0.70
isan        0.70
Activations Density 0.021%