    Negative Logits
    Token          Logit
    " Ry"          -0.08
    " Buffered"    -0.08
    " extend"      -0.07
    " vibe"        -0.07
    "ль"           -0.07
    "مش"           -0.07
    " Wind"        -0.07
    "Finger"       -0.07
    "nh"           -0.07
    "Flower"       -0.07
    Positive Logits
    Token               Logit
    " substitutions"    0.12
    " substitution"     0.11
    " substitute"       0.11
    " sustit"           0.11
    " sustitu"          0.10
    " Substitute"       0.10
    " SUBSTITUTE"       0.09
    " substitutes"      0.09
    " substit"          0.09
    " đổi"              0.09
    Activation Density: 0.007%

    No Known Activations