INDEX
    NEGATIVE LOGITS
    Token                    Logit
    "hev"                    -0.07
    "ーツ"                   -0.07
    "男人"                   -0.06
    " oppon"                 -0.06
    "eli"                    -0.06
    " seviy"                 -0.06
    (token not rendered)     -0.06
    "퓨터"                   -0.06
    "_part"                  -0.06
    "_coef"                  -0.06
    POSITIVE LOGITS
    Token                    Logit
    " simmer"                0.07
    "_SUCCESS"               0.07
    " voiced"                0.06
    "_GRANTED"               0.06
    " Sugar"                 0.06
    (token not rendered)     0.06
    " coworkers"             0.06
    "пос"                    0.06
    "RG"                     0.06
    "cipher"                 0.06
    Activation Density: 0.002%

    No Known Activations