NEGATIVE LOGITS
    Token         Logit
    "ě"           0.91
    "."           0.82
    "нда"         0.80
    " établ"      0.80
    "на"          0.79
    " Gebiet"     0.78
    [not shown]   0.76
    " avanzar"    0.74
    "ეფ"          0.74
    " arrivée"    0.74
    POSITIVE LOGITS
    Token         Logit
    "that"        1.11
    "the"         1.08
    " feed"       1.03
    "r"           0.95
    "K"           0.95
    "at"          0.94
    "Feed"        0.93
    "feed"        0.91
    "i"           0.91
    "to"          0.89
    Activation Density: 0.006%

    No Known Activations