    Negative Logits
    Token              Logit
    " puisque"         0.50
    "anden"            0.45
    "ulle"             0.44
    "whatever"         0.44
    "Suitable"         0.44
    "ustainability"    0.43
    "punkt"            0.43
    "annya"            0.43
    " whatever"        0.42
    "ensatz"           0.42
    Positive Logits
    Token              Logit
    " بعض"             1.06
    " некоторые"       1.05
    " algunos"         1.02
    " برخی"            1.02
    " некоторых"       0.99
    " alguns"          0.99
    " sommige"         0.96
    " някои"           0.94
    " some"            0.90
    " ചില"             0.88
    Activation Density: 0.176%

    No Known Activations