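    Negative Logits
    Token                Value
    [token not shown]    1.06
    "ی"                  0.79
    "с"                  0.77
    "reszt"              0.70
    [token not shown]    0.70
    "ř"                  0.69
    "exp"                0.68
    "v"                  0.68
    [token not shown]    0.68
    "س"                  0.68

    Positive Logits
    Token                Value
    "is"                 0.94
    " bactéri"           0.86
    "та"                 0.85
    [token not shown]    0.85
    " acaba"             0.85
    "عرف"                0.82
    " очки"              0.75
    "ם"                  0.75
    " averse"            0.74
    " lathes"            0.73
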
    Activation Density: 3.539%

    No Known Activations