    Negative Logits
    Token           Value
    "A"             0.77
    "p"             0.76
    "1"             0.75
    " apostila"     0.64
    "Э"             0.63
    "ina"           0.61
    "nya"           0.61
    " همان"         0.61
    "Golf"          0.60
    "6"             0.60
    Positive Logits
    Token               Value
    "ی"                 0.78
    "ي"                 0.71
    "ο"                 0.70
    "اف"                0.68
    "ب"                 0.65
    (token not shown)   0.64
    " MIF"              0.64
    "െല്ലാം"             0.64
    " pove"             0.64
    "ownicy"            0.63
    Activation Density: 0.001%

    No Known Activations