    Negative Logits
     heath      -0.09
     ht         -0.08
    kp          -0.08
     Mourinho   -0.08
     skiing     -0.08
     kp         -0.07
    .mm         -0.07
    کش          -0.07
    upp         -0.07
    obos        -0.07

    Positive Logits
     Ora         0.08
                 0.07
     miracle     0.07
    Chan         0.07
     epoch       0.07
    Ora          0.07
     anomalies   0.07
     US          0.07
    Su           0.07
     sello       0.07

    Act Density 0.000%

    No Known Activations