INDEX
Explanations
neighborhood-related terms or locations
mentions of neighborhoods or related terms
Negative Logits
bearer      -0.80
REDACTED    -0.72
udeb        -0.70
Mehran      -0.68
ATA         -0.67
ISM         -0.65
utenberg    -0.64
))))        -0.63
xxxxxxxx    -0.60
ista        -0.60
Positive Logits
bors        1.44
bour        1.12
bor         1.11
Neigh       0.95
Neigh       0.95
eties       0.91
stairs      0.91
«           0.90
neighb      0.89
neighbour   0.88
Activations Density 0.007%