INDEX
    Explanations
    Negative Logits
        " Patient"           -0.07
        "false"              -0.06
        " Lift"              -0.06
        "_LR"                -0.06
        " Bei"               -0.06
        " б"                 -0.06
        "โรงแรม"             -0.06
        "idor"               -0.06
        (token not shown)    -0.06
        (token not shown)    -0.06
    Positive Logits
        " wrongdoing"        0.07
        "alignment"          0.06
        " cứng"              0.06
        " исч"               0.06
        (token not shown)    0.06
        " election"          0.06
        " размещ"            0.06
        "mj"                 0.06
        "HeaderCode"         0.06
        "-buy"               0.06
    Activation Density: 0.014%

    No Known Activations