    Explanations

    statements of truth or affirmation

    Negative Logits
    Token             Logit
    " Monfieur"       -0.83
    "AnchorStyles"    -0.80
    " يتيمه"          -0.80
    " Bernadette"     -0.77
    " avoient"        -0.73
    " ejus"           -0.70
    "Collegamenti"    -0.69
    "__*/"            -0.68
    " côtés"          -0.68
    "Ivo"             -0.67

    Positive Logits
    Token             Logit
    " True"           1.39
    " true"           1.39
    " TRUE"           1.31
    " Tru"            1.18
    "True"            1.17
    "TRUE"            1.13
    " False"          1.11
    "Tru"             1.11
    "stdbool"         1.07
    " truer"          1.06
    Activation Density: 0.091%

    No Known Activations