INDEX
    Explanations
    Negative Logits
    Token       Logit
    " a"        1.38
    "0"         1.20
    "un"        1.14
    " an"       1.04
    " it"       1.02
    " A"        1.02
    " as"       1.01
    " by"       0.96
    "่า"        0.93
    (missing)   0.91
    Positive Logits
    Token       Logit
    ","         1.27
    " bumping"  1.22
    "bump"      1.15
    " bumped"   1.12
    "aile"      1.09
    " bump"     1.08
    "u"         1.08
    "ه"         1.08
    "ளில்"      1.05
    "t"         0.98
    Act Density 0.003%

    No Known Activations