INDEX
    Explanations

    the presence of various forms of the verb "to be."

    Negative Logits
    " Something"    -0.16
    "something"     -0.15
    "uke"           -0.15
    " anything"     -0.15
    "atever"        -0.14
    "之一"          -0.14
    " Anything"     -0.14
    "nemonic"       -0.14
    "bilt"          -0.14
    "indle"         -0.14

    Positive Logits
    " happening"    0.31
    " going"        0.25
    " wrong"        0.25
    "wrong"         0.21
    " happ"         0.20
    " done"         0.20
    "/is"           0.19
    " happened"     0.19
    "Wrong"         0.18
    " Wrong"        0.18
    Act Density 0.073%
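
For readers unfamiliar with the figure above, here is a minimal sketch of how an activation density could be computed. It assumes density means the fraction of token positions on which the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all. The dashboard's real definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 7 of 10,000 token positions.
acts = [0.0] * 9993 + [1.2] * 7
print(f"{activation_density(acts):.3%}")
```

Under this assumed definition, a density of 0.073% would mean the feature is active on roughly 7 of every 10,000 tokens.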

    No Known Activations