    NEGATIVE LOGITS
    DataAnnotations   -0.53
    RegressionTest    -0.45
     inoxydable       -0.43
    ctest             -0.42
     bricolaje        -0.41
    Claro             -0.41
     espèce           -0.40
    NCF               -0.40
     desmotivaciones  -0.39
    getMenuInflater   -0.39
    POSITIVE LOGITS
     originally       0.64
    الدراسه           0.58
     Originally       0.57
    originally        0.56
     originalmente    0.56
                      0.54
     previously       0.52
    原本              0.52
     Was              0.50
    enderror          0.50
    Activation Density: 0.106%

    No Known Activations