INDEX
    Explanations

    possessive and individualizing words

    Negative Logits
    Token          Value
    tte            2.11
    }              1.88
    tion           1.87
    le             1.84
    ا              1.75
    }])            1.66
    ра             1.58
    al             1.55
    (not shown)    1.54
    tus            1.52
    Positive Logits
    Token          Value
    Те             2.14
    Д              2.00
    बी             1.94
    (not shown)    1.86
    ले             1.77
    تی             1.74
    И              1.73
    Л              1.73
    Га             1.71
    То             1.67
    Activation Density: 0.081%

    No Known Activations