    NEGATIVE LOGITS
     کتاب        -0.06
    сен          -0.06
    vection      -0.06
    eten         -0.06
     brightest   -0.06
    obra         -0.06
     neob        -0.06
    variant      -0.06
     instant     -0.06
     mcc         -0.06
    POSITIVE LOGITS
    Sorry         0.08
    _trigger      0.06
     tăng         0.06
    -found        0.06
    Vel           0.06
    DIRECT        0.06
                  0.06
     excit        0.06
     Sorry        0.06
    enthal        0.06
    Act Density 0.042%

    No Known Activations