INDEX
    Explanations

    claims or statements that are unsupported or unsubstantiated

    Negative Logits
    " actionGroup"  -0.90
    "ktop"          -0.76
    "isites"        -0.74
    "itaire"        -0.72
    "ivation"       -0.72
    "ahime"         -0.70
    " mobility"     -0.69
    "ivating"       -0.69
    "rontal"        -0.68
    "itect"         -0.68
    Positive Logits
    " debunked"        1.30
    " falsehood"       1.23
    " assertions"      1.20
    "False"            1.18
    "Claim"            1.14
    " debunk"          1.13
    " untrue"          1.13
    " misinformation"  1.13
    " baseless"        1.11
    " claims"          1.10
    Activation Density 0.587%

    No Known Activations