INDEX
    Explanations
    Negative Logits
    Token            Logit
    " دفاع"          -0.07
    "FontAwesome"    -0.07
    " może"          -0.06
    " voll"          -0.06
    " Wan"           -0.06
    "æ"              -0.06
    "emaakt"         -0.06
    " людини"        -0.06
    "using"          -0.06
    " pile"          -0.06
    Positive Logits
    Token            Logit
    " hardship"      0.33
    " hardships"     0.24
    " Hancock"       0.11
    "jQuery"         0.07
    "ưỡng"           0.07
    " fighting"      0.07
    "etag"           0.07
    " Gingrich"      0.07
    "ighting"        0.07
    ".tie"           0.07
    Activation Density: 0.001%

    No Known Activations