INDEX
    Explanations

    instances of the word "shut" in various contexts

    Negative Logits
    "207"         -0.15
    "orra"        -0.15
    "auss"        -0.15
    "illet"       -0.14
    "iced"        -0.14
    "atha"        -0.14
    "/="          -0.14
    "ãĥ¼ãĥ«"      -0.14
    "heets"       -0.13
    "headline"    -0.13
    Positive Logits
    "ters"        0.40
    "ting"        0.26
    "tings"       0.24
    "ty"          0.24
    " tight"      0.24
    " down"       0.23
    "t"           0.22
    "ti"          0.22
    "tl"          0.21
    "out"         0.21
    Activation Density: 0.008%

    No Known Activations