INDEX
    Explanations

    neighborhood-related terms

    references to neighborhoods and community contexts

    Negative Logits
    " bearer"        -0.75
    "isted"          -0.73
    " Mehran"        -0.67
    "ista"           -0.66
    REDACTED         -0.64
    " whip"          -0.63
    "arian"          -0.63
    "ISM"            -0.63
    "displayText"    -0.60
    "istic"          -0.60
    Positive Logits
    "bors"           1.44
    " Neigh"         1.34
    "Neigh"          1.31
    "bour"           1.18
    " neighbours"    1.04
    " neighbour"     1.03
    " neighb"        1.03
    " neighbors"     1.01
    "bor"            0.96
    "stairs"         0.91
    Act Density 0.014%

    No Known Activations