INDEX
Explanations
locations or places
phrases indicating physical locations
Negative Logits
ework       -0.84
eries       -0.78
vous        -0.77
haw         -0.72
airs        -0.72
rendered    -0.70
eden        -0.69
nature      -0.68
ned         -0.68
ese         -0.67
Positive Logits
uate            0.95
downtown        0.87
atop            0.85
centrally       0.85
near            0.85
smack           0.84
geographically  0.84
somewhere       0.83
therein         0.83
inside          0.80
Activations Density: 0.048%