    Explanations
    Negative Logits
    Token             Logit
    "Telefono"        -0.07
    "StateMachine"    -0.07
    " người"          -0.07
    " coke"           -0.07
    " todo"           -0.06
    " politics"       -0.06
    "likes"           -0.06
    " makeover"       -0.06
    " heart"          -0.06
    " novice"         -0.06
    Positive Logits
    Token             Logit
    " Proj"           0.07
    " Jill"           0.07
                      0.07
    "plaint"          0.07
    " κύ"             0.07
    "orsch"           0.06
    "114"             0.06
    "ilians"          0.06
    "yla"             0.06
    "_TBL"            0.06
    Activation Density: 0.001%

    No Known Activations