    Negative Logits
    b     1.03
    st    0.98
    s     0.94
    c     0.89
    of    0.86
    d     0.83
    ur    0.81
          0.79
    am    0.78
    ol    0.77
    Positive Logits
     inhaling   0.82
    ва          0.80
     nul        0.80
     inhaled    0.80
    на          0.78
     arbre      0.77
     enfin      0.77
     inhal      0.77
     jeopard    0.76
     inhale     0.76
    Activation Density: 0.003%

    No Known Activations