INDEX
    Explanations

    phrases expressing overt dishonesty or bold statements that are clearly untrue

    Negative Logits
    Token                  Logit
    "CodedInputStream"     -0.86
    " насељу"              -0.77
    "évaluateur"           -0.70
    " بيها"                -0.67
    "WriteBarrier"         -0.66
    "الدراسه"              -0.64
    "InjectAttribute"      -0.63
    "رشف"                  -0.63
    " useStyles"           -0.62
    "DeleteBehavior"       -0.62
    Positive Logits
    Token                  Logit
    " outright"            1.04
    " downright"           0.80
    " blatant"             0.70
    " blatantly"           0.66
    " overt"               0.66
    " openly"              0.60
    " gross"               0.57
    " explicit"            0.57
    " completely"          0.57
    " totally"             0.57
    Activation Density: 0.714%

    No Known Activations