INDEX
    Explanations

    contractions and informal language

    Negative Logits
    i        0.92
    تان      0.83
    dav      0.82
    سازی     0.80
    ی        0.80
    ة        0.79
    ടുത      0.79
             0.78
    talk     0.78
    াদা      0.78
    Positive Logits
    '."      1.23
             1.21
    '"       1.18
    ',       1.17
    ':       1.14
    '-       1.14
    '.       1.13
     lotta   1.13
    ',"      1.12
    '+       1.11
    Act Density 0.103%

    No Known Activations