    Negative Logits
    t       2.53
    ti      2.41
    tos     2.02
    tier    2.02
    e       2.02
    tie     1.94
    es      1.93
    ی       1.88
    وو      1.87
    s       1.85
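    Positive Logits
    ف       2.33
    (token text missing)    2.30
    ية      2.11
    (token text missing)    2.11
    (token text missing)    2.05
    (token text missing)    2.05
    ä       2.03
    (token text missing)    2.02
    த்தில்    1.99
    ena     1.95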
    Activation Density: 0.054%

    No Known Activations