INDEX
Explanations
references to the city of San Francisco
mentions of San Francisco (SF)
Negative Logits
lain     -0.84
stall    -0.83
taker    -0.73
pins     -0.71
berus    -0.71
hift     -0.69
abase    -0.69
pin      -0.68
Kant     -0.67
iasis    -0.67
Positive Logits
PD           1.00
Chronicle    0.98
WA           0.97
SF           0.94
ML           0.92
ORE          0.91
DC           0.91
ORTS         0.90
WD           0.89
DF           0.88
Activations Density: 0.010%