    Explanations

    phrases related to disappointment and dissatisfaction

    Negative Logits
    Token          Logit
    "ena"          -0.15
    (not shown)    -0.14
    "::↵"          -0.13
    " thanks"      -0.13
    "kop"          -0.13
    ":↵"           -0.13
    " Indeed"      -0.13
    "ifr"          -0.12
    " below"       -0.12
    "ëį°ìĿ´íĬ¸"    -0.12
    Positive Logits
    Token     Logit
    " ["      0.24
    "[s"      0.23
    " [$"     0.23
    "[,]"     0.18
    "[in"     0.18
    " ['"     0.17
    "[to"     0.17
    " [<"     0.17
    " [_"     0.17
    " [`"     0.17
    Activation Density: 2.423%

    No Known Activations