INDEX
    Negative Logits
        Token          Logit
        "ivan"         -0.07
        " cores"       -0.07
        "Button"       -0.06
        "ertil"        -0.06
        " wiped"       -0.06
        "-reg"         -0.06
        " Raven"       -0.06
        "aven"         -0.06
        "motor"        -0.06
        ".Content"     -0.06

    Positive Logits
        Token          Logit
        " полож"       0.07
        "_Pos"         0.07
        " deluxe"      0.07
        ".poly"        0.07
        " Mattis"      0.07
        " contiene"    0.07
        " Besch"       0.06
        "(use"         0.06
        "нак"          0.06
        " Plus"        0.06

    Activation Density: 0.009%

    No Known Activations