INDEX
    Explanations
    Negative Logits
        " Penguin"      -0.07
        " Streaming"    -0.07
        " trench"       -0.06
        " analytic"     -0.06
        " Frame"        -0.06
        "adf"           -0.06
        "lig"           -0.06
        "uggage"        -0.06
        "athlete"       -0.06
        " Green"        -0.06
    Positive Logits
        " честь"        0.07
        " inventor"     0.07
        "τια"           0.07
        " obsessed"     0.06
        "èmes"          0.06
        " производ"     0.06
        " цен"          0.06
        " UIG"          0.06
        " там"          0.06
        "brightness"    0.06
    Act Density: 0.006%

    No Known Activations