INDEX
Explanations
instances of the word "urban" or variations thereof
references to urban areas or related geographical terms
Negative Logits
xon          -0.86
aurus        -0.80
creen        -0.80
prus         -0.72
rehensive    -0.67
ertodd       -0.67
lder         -0.66
ppo          -0.66
CHR          -0.65
paio         -0.64
Positive Logits
urb          1.01
ulent        0.90
ulence       0.89
abies        0.86
inson        0.86
inator       0.81
inators      0.80
icz          0.75
yrinth       0.74
orph         0.74
Activations Density: 0.022%