INDEX
    Explanations

    Words ending in -ing, -ed, or other suffixes

    Negative Logits
    Token               Value
    " so"               0.71
    (no token shown)    0.68
    "/"                 0.63
    " re"               0.59
    " -"                0.59
    " الش"              0.57
    " s"                0.57
    " Tal"              0.57
    " scratch"          0.56
    "ミア"              0.55
    Positive Logits
    Token               Value
    (no token shown)    1.59
    "ing"               1.57
    "ed"                1.55
    "sembled"           1.38
    "edLeft"            1.38
    "하는"              1.36
    " করতে"             1.34
    "artition"          1.34
    "edTest"            1.30
    "ت"                 1.30
    Act Density 0.592%

    No Known Activations