INDEX
Explanations
references to neighborhoods, neighbors, and community settings
mentions of neighborhoods and related community terms
Negative Logits
bearer      -0.78
udeb        -0.74
OPLE        -0.72
othal       -0.70
ISM         -0.69
ocobo       -0.66
"]=>        -0.66
REDACTED    -0.66
ONT         -0.66
lain        -0.66
Positive Logits
bors        1.47
Neigh       1.27
Neigh       1.14
bour        1.10
neighbour   1.08
neighbors   1.04
neighbours  1.03
neighbor    0.99
bor         0.94
neighb      0.92
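
The page does not say how the two logit lists are produced. A common approach in feature dashboards, assumed here rather than confirmed by the source, is to take the dot product of the feature's output direction with each token's unembedding vector and list the highest and lowest scores. A minimal sketch of that ranking follows; the function name `top_logit_effects`, the two-dimensional vectors, and the extra token `the` are all hypothetical toy values, not taken from any real model or from the numbers above.

```python
from typing import Dict, List, Sequence, Tuple

Ranked = List[Tuple[str, float]]


def top_logit_effects(
    direction: Sequence[float],
    unembed: Dict[str, Sequence[float]],
    k: int,
) -> Tuple[Ranked, Ranked]:
    """Return the k most positive and k most negative token scores.

    Assumption: a token's score is the dot product of the feature
    direction with that token's unembedding vector.
    """
    scores = {
        token: sum(d * w for d, w in zip(direction, vector))
        for token, vector in unembed.items()
    }
    # Sort once, highest score first.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    positive = ranked[:k]
    # Take the tail and reverse it so the most negative score comes first.
    negative = ranked[-k:][::-1]
    return positive, negative


# Toy data (hypothetical): a 2-dimensional feature direction and vocabulary.
direction = [1.0, 0.5]
unembed = {
    "bors": [1.0, 0.9],
    "neighbor": [0.8, 0.4],
    "the": [0.1, -0.2],
    "lain": [-0.5, -0.3],
    "bearer": [-0.6, -0.3],
}

positive, negative = top_logit_effects(direction, unembed, k=2)
print("positive:", [token for token, _ in positive])
print("negative:", [token for token, _ in negative])
```

Under this reading, a feature that responds to neighborhood-related text would push up the logits of tokens such as "bors" and "neighbor" and slightly push down unrelated tokens, which matches the shape of the lists above.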
Activations Density 0.006%