INDEX
    Explanations

    the presence of the word "false"

    Negative Logits
        Token           Logit
        " وتسجيلات"     -0.87
        "tamment"       -0.79
        "+#+#"          -0.78
        " Hecht"        -0.77
        " chré"         -0.76
        " Magi"         -0.75
        " Puig"         -0.71
        "زیین"          -0.69
        "---+"          -0.69
        " EClass"       -0.68
    Positive Logits
        Token           Logit
        " false"        1.17
        "false"         1.00
        " False"        0.94
        " fals"         0.89
        "False"         0.86
        "FALSE"         0.86
        " FALSE"        0.82
        " falsehood"    0.79
        "ation"         0.79
        " falsely"      0.79
    Activation Density: 0.100%

    No Known Activations