INDEX
    Explanations
    Negative Logits
     Bobby        -0.08
    _HS           -0.06
    phone         -0.06
     Microsoft    -0.06
     Robbie       -0.06
     одна         -0.06
     East         -0.06
     CONTR        -0.06
    ж             -0.06
    DO            -0.06
    Positive Logits
     suốt         0.07
    aliyet        0.06
     patiently    0.06
                  0.06
    _repo         0.06
     Ödül         0.06
    openssl       0.06
     epile        0.06
    wanted        0.06
    들도          0.06
    Act Density 0.012%

    No Known Activations