INDEX
    Explanations
    Negative Logits
     होता    -0.08
     secondly    -0.08
     doux    -0.08
    نون    -0.08
     ఉంటుంది    -0.08
     Fuk    -0.07
     pena    -0.07
     lòng    -0.07
     Gis    -0.07
     José    -0.07
    Positive Logits
     geraten    0.08
     horrible    0.08
     reading    0.08
     apparatuur    0.08
     लगाए    0.08
     Spe    0.08
    _finish    0.07
    会议    0.07
     finish    0.07
    设备    0.07
    Act Density 0.000%

    No Known Activations