    NEGATIVE LOGITS
    " sticks"            0.79
    " Sticks"            0.63
    " differentiator"    0.63
    (token not shown)    0.59
    " beliebt"           0.58
    "eem"                0.55
    "۔"                  0.55
    "ਾਈ"                 0.54
    " meest"             0.54
    " ইপিআর"             0.54
    POSITIVE LOGITS
    "R"                  1.09
    "O"                  1.06
    "E"                  1.04
    "ح"                  1.04
    "F"                  1.01
    "H"                  1.00
    "ل"                  0.97
    "л"                  0.96
    "G"                  0.95
    (token not shown)    0.95
    Activation Density: 0.001%

    No Known Activations