INDEX
Explanations
words related to urban settings or locations
references to streets or street-related contexts
Negative Logits
emort     -0.90
merce     -0.88
essee     -0.81
igham     -0.80
opsis     -0.77
ividual   -0.73
abase     -0.72
uania     -0.72
xit       -0.72
racuse    -0.71
Positive Logits
cars      1.08
lights    0.99
fighter   0.89
wear      0.88
ways      0.86
fare      0.83
light     0.82
car       0.80
corners   0.80
walk      0.79
Activations Density: 0.040%