    Negative Logits
    Token            Logit
    "ness"           -0.78
    " potential"     -0.71
    " formel"        -0.69
    "..."            -0.66
    "...\n"          -0.63
    " __("           -0.62
    "footnote"       -0.62
    " قدم"           -0.62
    " Kalyan"        -0.62
    "...."           -0.60
    Positive Logits
    Token            Logit
    " USA"           2.09
    "USA"            1.97
    " usa"           1.34
    " Usa"           1.25
    " للاسماء"       1.12
    "usa"            1.10
    "Usa"            1.06
    "RTLD"           0.93
    " USAID"         0.90
    " USSR"          0.87
    Activation Density: 0.078%

    No Known Activations