    Explanations

    Instances of the word "but" and its variants

    Negative Logits
    uppe          -0.17
    ops           -0.16
     therefore    -0.15
    ilee          -0.15
    das           -0.15
    §             -0.15
    olly          -0.15
    bite          -0.15
     susp         -0.15
    olarity       -0.14
    Positive Logits
    chers         0.24
    凡            0.24
    ler           0.24
    term          0.24
    lers          0.23
    ts            0.21
    сят           0.21
    rint          0.20
    ressing       0.20
    ters          0.19
    Act Density 0.053%

    No Known Activations