    NEGATIVE LOGITS
     Forget       -0.07
    ppt           -0.07
    افق           -0.07
    Regex         -0.06
    population    -0.06
    Borders       -0.06
    istema        -0.06
    _EX           -0.06
    Gain          -0.06
     theat        -0.06
    POSITIVE LOGITS
    .writeInt     0.07
    746           0.06
     Griffin      0.06
     underwater   0.06
     "_"          0.06
     Innoc        0.06
    .SendMessage  0.06
    СО            0.06
    :this         0.06
     mình         0.06
    Activation Density: 0.015%

    No Known Activations