INDEX
    Explanations

    references to geographical regions, specifically the West and its alternatives

    Negative Logits
    "odore"         -0.18
    "ANCE"          -0.17
    "ance"          -0.16
    "mouseleave"    -0.15
    "ity"           -0.15
    "attach"        -0.15
    "itemap"        -0.14
    "ottes"         -0.14
    "stown"         -0.14
    "xec"           -0.14
    Positive Logits
    "ward"      0.40
    "ern"       0.32
    "ERN"       0.31
    "erner"     0.31
    " Indies"   0.30
    "bound"     0.29
    "s"         0.29
    " ern"      0.28
    "wards"     0.27
    "เฉ"        0.26
    Activation Density: 0.050%

    No Known Activations