    Explanations

    references to the English language and culture

    Negative Logits
    Token         Logit
    "iku"         -0.17
    "bsolute"     -0.17
    "ÐľÐŀ"        -0.15
    "atorium"     -0.14
    "ering"       -0.14
    "opal"        -0.14
    " bigotry"    -0.14
    "ãĤ¯ãĥĪ"      -0.14
    "inker"       -0.13
    "ringe"       -0.13
    Positive Logits
    Token         Logit
    "iche"        0.18
    "ahl"         0.15
    "avors"       0.15
    "anie"        0.15
    "PURE"        0.14
    "_gb"         0.14
    "ائ"          0.14
    " Verde"      0.14
    " Virgin"     0.14
    "893"         0.14
    Activation Density: 0.059%

    No Known Activations