    Negative Logits
     Macro      -0.08
    macro       -0.07
    Oper        -0.07
     "-"↵       -0.07
     col        -0.07
    Macro       -0.07
     circular   -0.06
     al         -0.06
     Circular   -0.06
     macro      -0.06
    Positive Logits
     strength   0.16
     Strength   0.16
     strengths  0.13
    Strength    0.12
    strength    0.12
    _strength   0.10
    rength      0.10
    یت          0.08
     Tina       0.08
    力を          0.07
    Activation Density: 0.011%

    No Known Activations