INDEX
    Explanations
    Negative Logits
    "gantung"     -0.94
    " deti"       -0.93
    " besta"      -0.90
    " حسين"       -0.88
    "hth"         -0.86
    "venez"       -0.86
    "LastError"   -0.82
    " Achter"     -0.82
    "ouro"        -0.82
    " positiv"    -0.82
    Positive Logits
    " should"       1.05
    " on"           0.89
    " unlikely"     0.87
    "throw"         0.85
    " because"      0.84
    "Ύ"             0.81
    " cannot"       0.79
    " contribute"   0.77
    " chứa"         0.77
    " ignore"       0.77
    Act Density 0.003%

    No Known Activations