INDEX
    Explanations

    references to specific locations or demographics

    Negative Logits
    imin     -0.16
    on       -0.15
    outube   -0.14
    æĹ       -0.14
    UEST     -0.14
    139      -0.14
     uncon   -0.13
    ļ        -0.13
    ourn     -0.13
    azor     -0.13
    Positive Logits
    گراÙĨ    0.17
    iyon     0.15
    acades   0.15
    vet      0.14
    LIBINT   0.14
     metav   0.14
    struk    0.14
    antry    0.14
    pheres   0.14
    avern    0.14
    Activation Density: 0.070%

    No Known Activations