INDEX
    Explanations

    specific nouns and their associated actions or statuses in various contexts

    Negative Logits
    Token            Logit
    "Äįel"           -0.17
    " stains"        -0.15
    "ie"             -0.15
    "ies"            -0.15
    "iral"           -0.14
    "gain"           -0.14
    "_references"    -0.14
    " Wed"           -0.14
    "èĨ"             -0.14
    "Äįer"           -0.14

    Positive Logits
    Token            Logit
    "ardy"            0.17
    "rire"            0.17
    "rale"            0.17
    "ifestyles"       0.15
    " Gilbert"        0.14
    "aroo"            0.14
    "682"             0.14
    "ury"             0.14
    " Fury"           0.14
    "errupted"        0.14

    Activation Density: 0.049%

    No Known Activations