    NEGATIVE LOGITS
    ib     0.67
    se     0.63
    uk     0.62
    ya     0.61
    ik     0.57
    mis    0.57
    ky     0.56
    yg     0.56
    ok     0.55
    ked    0.54
    POSITIVE LOGITS
    ר              0.65
     fanciful      0.58
     musí          0.57
                   0.57
     métodos       0.55
     attendre      0.55
    ास             0.55
     octombrie     0.55
    דו             0.55
     време         0.55
    Activation Density: 0.000%

    No Known Activations