    Explanations

    expressions of curiosity or contemplation

    Negative Logits
    ucci        -0.17
    ukan        -0.14
    roke        -0.14
    indle       -0.14
    uka         -0.14
    serter      -0.14
    olland      -0.14
    uth         -0.14
    NotBlank    -0.14
    é¨ĵ         -0.13
    Positive Logits
     whether    0.19
     WHETHER    0.17
    whether     0.17
    quete       0.16
    æĺ¯åIJ¦      0.16
    alta        0.16
    atti        0.15
    ogl         0.15
    ůr          0.15
    ictory      0.15
    Activation Density: 0.012%

    No Known Activations