INDEX
    Negative Logits
    Token           Logit
     Merkel         -0.07
     Comfort        -0.07
     kov            -0.06
     frontline      -0.06
     outrageous     -0.06
     Kinect         -0.06
     kriz           -0.06
     завжди         -0.06
     Hình           -0.06
     Zhu            -0.06
    Positive Logits
    Token           Logit
    باب             0.07
     season         0.06
     hlavou         0.06
     lament         0.06
     snapshot       0.06
    :value          0.06
     skillet        0.06
    CTOR            0.06
    ์น              0.06
    ejte            0.06
    Activation Density: 0.045%

    No Known Activations