    Negative Logits
    Token                Logit
    " تضيفلها"  -0.91
    "GEBURTSDATUM"       -0.89
    "Cox"                -0.85
    " Eucharist"         -0.79
    "aarrggbb"           -0.77
    "RegressionTest"     -0.76
    " Cox"               -0.72
    " AppCompatTheme"    -0.68
    " cox"               -0.67
    " nakalista"         -0.67
    Positive Logits
    Token                Logit
    " Forces"            0.49
    "ztály"              0.47
    " Lit"               0.47
    "ernos"              0.47
    "رب"  0.47
    "Conserv"            0.47
    "şekkür"             0.46
    "воро"               0.45
    " Out"               0.44
    "țele"               0.44
    Activation Density: 0.014%

    No Known Activations