INDEX
    Explanations

    discussions about truth, false claims, and accusations

    True/false statements

    identifying false statements

    Negative Logits
    Token              Logit
    AndEndTag          -0.69
     виправивши        -0.68
    OGND               -0.68
     EconPapers        -0.67
    >",                -0.67
    LookAnd            -0.59
     transpa           -0.57
    ]**                -0.55
    })));              -0.55
     ProgressDialog    -0.54
    Positive Logits
    Token              Logit
     untrue            1.39
     false             1.16
    false              1.13
     incorrect         1.12
     FALSE             1.06
    False              1.03
     False             1.03
     falso             1.01
     falsehood         1.00
    FALSE              1.00
    Act Density 0.280%
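
As a rough illustration of what an activation density figure like the one above usually expresses, the sketch below computes the fraction of token positions at which a feature fires. It assumes density means "share of positions with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the sample activations are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature is active. The dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 7 of 2500 token positions.
sample = [0.0] * 2493 + [0.4, 1.2, 0.1, 2.3, 0.9, 0.05, 3.1]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

A density this low would mean the feature is sparse: it is silent on nearly all tokens and active only on a small, specific subset.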

    No Known Activations