    Negative Logits
    Token                 Logit
    "ب"                   1.31
    "بك"                  1.11
    "on"                  0.99
    "in"                  0.98
    "ف"                   0.95
    "ת"                   0.95
    "ع"                   0.93
    (no visible token)    0.91
    "to"                  0.90
    "ри"                  0.89
    Positive Logits
    Token                 Logit
    " וכ"                 0.83
    (no visible token)    0.79
    " incluidos"          0.77
    "équipe"              0.75
    " siguiendo"          0.74
    " titulado"           0.74
    " tenido"             0.73
    "immagine"            0.73
    " adequada"           0.73
    " travaux"            0.73
    Activation Density: 0.015%

    No Known Activations