INDEX
    Explanations

    numbers and Arabic script

    Negative Logits
    "claimed"            0.52
    "aconda"             0.52
    "akura"              0.52
    (no token shown)     0.51
    " I"                 0.50
    (no token shown)     0.49
    "akse"               0.48
    "G"                  0.48
    " встреча"           0.47
    " сам"               0.47
    Positive Logits
    "تين"                0.57
    "z"                  0.56
    "زي"                 0.54
    "ي"                  0.54
    " देऊन"              0.53
    "dır"                0.53
    " grau"              0.53
    " darle"             0.51
    " doua"              0.51
    " جهت"               0.50
    Act Density 0.161%

    No Known Activations