    Explanations

    lunch followed by parenthesis

    NEGATIVE LOGITS
    Token        Logit
    "t"          0.91
    ")"          0.70
    "With"       0.64
    "AB"         0.63
    "d"          0.63
    "_"          0.63
    "llä"        0.61
    " an"        0.61
    "ן"          0.61
    "User"       0.60

    POSITIVE LOGITS
    Token        Logit
    "ра"         1.04
    "at"         1.02
    "ла"         0.90
    "ме"         0.83
    " lunches"   0.81
    " lunch"     0.80
    "ور"         0.79
    "не"         0.78
    " Dinner"    0.77
    "ний"        0.76

    Activation Density: 0.012%

    No Known Activations