    Explanations

    mentions of "New" related to geographical locations

    Negative Logits
    "_PTR"       -0.17
    "dos"        -0.15
    "Ä±ÅŁtır"     -0.14
    "agas"       -0.14
    "blob"       -0.14
    "GIN"        -0.14
    "pone"       -0.14
    "ÑıÑģ"       -0.14
    "ALAR"       -0.13
    "velope"     -0.13
    Positive Logits
    " Haven"     0.25
    "nan"        0.22
    " Britain"   0.22
    "ington"     0.21
    " Hope"      0.21
    "ark"        0.21
    "alla"       0.21
    " Braun"     0.20
    " Mil"       0.20
    " haven"     0.20
    Activation Density: 0.026%

    No Known Activations