INDEX
    Explanations

    phrases associated with claims or statements being validated or invalidated

    misinformation and falsehoods

    Negative Logits
    "AndEndTag"           -0.70
    "ValueStyle"          -0.65
    "ItemBackground"      -0.62
    "IsMutable"           -0.61
    "tagHelperRunner"     -0.60
    "WebElementEntity"    -0.57
    "ViewFeatures"        -0.56
    "principalColumn"     -0.55
    "twimg"               -0.54
    "EndContext"          -0.50
    Positive Logits
    " reality"      0.44
    " actually"     0.44
    " reversed"     0.42
    " actuality"    0.42
    "reality"       0.40
    " dụ"           0.38
    " fic"          0.38
    " realidad"     0.38
    " refuted"      0.37
    "Reality"       0.37
    Act Density 0.163%

    No Known Activations