INDEX
    Explanations

    the word "all" with stronger activation for the digit '9'

    phrases that include the word "all."

    Negative Logits
    " Provision"   -0.66
    " Caption"     -0.64
    " Tanz"        -0.63
    "abwe"         -0.62
    "aminer"       -0.62
    "sofar"        -0.61
    " Racer"       -0.59
    "edIn"         -0.59
    "bledon"       -0.58
    " KH"          -0.58
    Positive Logits
    "igator"       1.30
    "usion"        1.17
    "usive"        1.04
    "uring"        1.04
    "ocating"      0.99
    "igators"      0.99
    "iter"         0.98
    "edged"        0.97
    "usions"       0.97
    " encomp"      0.97
    Act Density 0.046%

    No Known Activations