INDEX
    Explanations

    historical figures and their achievements

    Negative Logits
    نا
    0.37
    ش
    0.35
    ности
    0.34
    0.33
    ج
    0.32
    س
    0.32
    τ
    0.30
    وح
    0.30
    as
    0.29
    ब्र
    0.29
    Positive Logits
     be
    0.35
     stesso
    0.35
    }
    0.34
     was
    0.32
    '
    0.31
     an
    0.28
     I
    0.28
     Jr
    0.28
     Seite
    0.28
     protégé
    0.28
    Act Density 0.203%

    No Known Activations