INDEX
Explanations
mentions of locations, particularly cities like San Francisco and San Diego
Negative Logits
Token      Logit
ï¸ı        -0.72
numbering  -0.70
numbered   -0.68
llers      -0.68
ilater     -0.67
llor       -0.65
monop      -0.64
ICS        -0.63
pus        -0.62
hower      -0.62
Positive Logits
Token       Logit
Francisco   1.26
Diego       1.22
ctuary      1.13
Antonio     1.11
San         1.09
Bernardino  1.02
ibel        0.97
itary       0.97
Disk        0.97
gha         0.93
Activations Density: 0.018%