INDEX
    Explanations

    phrases that express expectations or beliefs about future events

    Negative Logits
    Token            Logit
    aption           -0.16
     addCriterion    -0.15
    andle            -0.14
    ibox             -0.14
    ellig            -0.14
    inka             -0.14
     hypothetical    -0.14
    дом              -0.14
    alim             -0.14
    lda              -0.14
    Positive Logits
    Token            Logit
    ey               0.17
    aley             0.15
    ÃĹ↵↵             0.14
    urge             0.14
    گاÙĩ             0.14
    åIJĽ              0.14
    avra             0.13
    SV               0.13
    *)_              0.13
    kel              0.12
    Activation Density: 0.021%

    No Known Activations