    Explanations

    phrases related to confusion or lack of understanding

    repeated references to the concept of "going on," indicating a search for clarity or understanding in a situation

    Negative Logits
    "gart"        -0.72
    "vale"        -0.65
    "ament"       -0.59
    "inates"      -0.57
    "stra"        -0.54
    "adjusted"    -0.54
    "aments"      -0.52
    " Desk"       -0.52
    " rede"       -0.52
    " relinqu"    -0.51
    Positive Logits
    " wrong"       0.91
    " happening"   0.80
    " viral"       0.75
    " Wrong"       0.71
    " happen"      0.70
    "wrong"        0.70
    "ãĥ£"          0.69
    "ggle"         0.69
    "lems"         0.69
    "æĸ¹"          0.68
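
    The tokens ãĥ£ and æĸ¹ in the positive logits list look like raw entries from a byte-level BPE vocabulary rather than readable text. The sketch below assumes a GPT-2-style byte-to-character table (the source does not name the tokenizer, so this is an assumption) and maps such strings back to UTF-8. The helper name `decode_token` is hypothetical and introduced only for illustration.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte (0-255) to a printable character."""
    printable = (
        list(range(33, 127))      # '!' .. '~'
        + list(range(161, 173))   # '¡' .. '¬'
        + list(range(174, 256))   # '®' .. 'ÿ'
    )
    printable_set = set(printable)
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable_set:
            # Non-printable bytes are shifted to code points starting at 256.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Decode a byte-level vocabulary string to text (assumes GPT-2-style encoding)."""
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal leading space
            # as displayed on the page) are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ£", "æĸ¹", " wrong", "ggle"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

    Under that assumption, the two unreadable entries decode to a Japanese kana and a CJK character, while the ASCII tokens pass through unchanged.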
    Act Density 0.031%
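
    Activation density is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below assumes that definition (the source does not state how the figure was computed) and uses synthetic activations; the function name `activation_density` and the zero threshold are illustrative choices.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 31 active positions out of 100,000 sampled tokens.
acts = [0.0] * 100_000
for i in range(31):
    acts[i * 3000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")
```

    At a density of 0.031%, the feature would be active on roughly 3 of every 10,000 tokens, which fits a sparse feature tied to a narrow set of phrasings.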

    No Known Activations