    Explanations

    phrases related to disbelief or rejection of claims

    terms related to claims that are considered invalid, unsupported, or lacking credibility

    Negative Logits
    "udeau"         -0.83
    "illes"         -0.81
    "hov"           -0.79
    "odon"          -0.73
    " Franç"        -0.71
    "anse"          -0.71
    "oise"          -0.71
    "eeper"         -0.70
    "toc"           -0.69
    "irie"          -0.69
    Positive Logits
    " baseless"     1.03
    " unfounded"    1.01
    "False"         0.91
    " allegations"  0.90
    " accusations"  0.88
    "Rum"           0.85
    "ãĥ¥"           0.84
    " false"        0.83
    " allegation"   0.82
    " theories"     0.80
    Activation Density: 0.022%

    No Known Activations