    Negative Logits
    Token             Logit
    " било"           -0.08
    " empê"           -0.08
    " conductive"     -0.08
    "ieun"            -0.08
    " عليه"           -0.08
    " prevented"      -0.08
    "lectron"         -0.07
    "ாட"              -0.07
    " charcoal"       -0.07
    "ாள"              -0.07
    Positive Logits
    Token             Logit
    " anúncio"        0.08
    " Caption"        0.08
    "ında"            0.08
    "Dropdown"        0.08
    " Política"       0.08
    " Nutr"           0.08
    [token missing]   0.07
    "ıç"              0.07
    "mal"             0.07
    [token missing]   0.07
    Activation Density: 0.009%

    No Known Activations