INDEX
    NEGATIVE LOGITS
    Token               Value
    "ب"                 1.62
    "ر"                 1.58
    "री"                1.37
    "م"                 1.27
    "مراجع"             1.26
    (not recovered)     1.25
    "ف"                 1.25
    "ي"                 1.24
    " тех"              1.23
    (not recovered)     1.21
    POSITIVE LOGITS
    Token               Value
    "ated"              1.35
    "ured"              1.24
    "ually"             1.23
    "ien"               1.23
    "il"                1.19
    "ot"                1.19
    "ile"               1.18
    "ata"               1.18
    "ina"               1.18
    "OD"                1.16
    Activation Density: 0.001%

    No Known Activations