    Negative Logits
    Logit   Token
    -0.08   " political"
    -0.07   " Cou"
    -0.07   "్ల"
    -0.07   " Iran"
    -0.07   " Classes"
    -0.07   "Classes"
    -0.07   "ng"
    -0.07   "acent"
    -0.07   "hd"
    -0.07   " targeted"
    Positive Logits
    Logit   Token
    0.09    " birthplace"
    0.08    " segíts"
    0.08    " البداية"
    0.08    "年代"
    0.08    " حجر"
    0.08    " reloj"
    0.08    " скор"
    0.08    " masterpiece"
    0.08    " piedra"
    0.07    "Require"
    Activation Density: 0.000%

    No Known Activations