    NEGATIVE LOGITS
    expect        -0.07
     kicker       -0.06
    _ts           -0.06
    _gift         -0.06
    .ul           -0.06
    ubic          -0.06
     peter        -0.06
    .error        -0.06
     Yue          -0.05
    354           -0.05
    POSITIVE LOGITS
    ITO            0.07
    _DEBUG         0.06
     wik           0.06
     мит           0.06
                   0.06
    .UserName      0.06
    stellung       0.06
                   0.06
    ken            0.06
    	arg          0.06
    Activation Density: 0.015%

    No Known Activations