INDEX
    Negative Logits
    "idis"       -0.07
    "WiFi"       -0.07
    ".swt"       -0.07
    "urope"      -0.07
    " bottle"    -0.07
    "�게"        -0.07
    "iki"        -0.06
    "430"        -0.06
    "ikt"        -0.06
    "Scott"      -0.06
    Positive Logits
    " lean"       0.12
    " Lean"       0.12
    "Lean"        0.09
    "lean"        0.08
    " leaning"    0.08
    "_Lean"       0.08
    " leaned"     0.08
    "­ing"        0.07
    " remain"     0.07
    " wan"        0.07
    Activation Density: 0.002%

    No Known Activations