INDEX
    Explanations

    Scornful attitudes

    Negative Logits
     DIAG         -0.07
     bursting     -0.06
     реє          -0.06
     weave        -0.06
     feels        -0.06
     explosive    -0.06
    .jd           -0.06
     Readonly     -0.06
     sector       -0.06
     Trouble      -0.06
    Positive Logits
     mocking      0.07
     franc        0.07
     confidently  0.07
     defa         0.07
    eking         0.07
    ieur          0.07
    """),↵        0.07
     scorn        0.06
     exhibiting   0.06
     бл           0.06
    Act Density 0.015%

    No Known Activations