    Explanations

    adverbs and words indicating recent or ongoing conditions and actions

    Negative Logits

    Token          Logit
    "inci"         -0.16
    ".nlm"         -0.15
    "779"          -0.14
    "fty"          -0.14
    "ubi"          -0.14
    "stone"        -0.14
    "Ñĥнк"         -0.13
    "isan"         -0.13
    "olation"      -0.13
    "گراÙĨ"        -0.13

    Positive Logits

    Token          Logit
    "lamaz"        0.16
    " whose"       0.15
    " worth"       0.15
    "EEDED"        0.14
    "ifestyles"    0.14
    "лага"         0.14
    " which"       0.14
    "езда"         0.14
    " cui"         0.14
    "ãģ¹"          0.13

    Act Density 0.230%

    No Known Activations