INDEX
Explanations
locations or places
phrases indicating physical locations
Negative Logits
ework      -0.74
sequence   -0.73
rendered   -0.71
spir       -0.69
ned        -0.69
vous       -0.68
ese        -0.67
eries      -0.66
airs       -0.66
doms       -0.65
Positive Logits
uate            1.00
atop            0.97
geographically  0.96
near            0.95
smack           0.94
somewhere       0.92
centrally       0.89
squarely        0.86
downtown        0.85
beneath         0.84
Activations Density 0.061%