    Explanations

    words related to unconfirmed or disputed claims

    Negative Logits
    adan        -0.91
    utics       -0.79
    oton        -0.77
    ctors       -0.77
    lished      -0.75
    uden        -0.74
    atever      -0.74
    cair        -0.74
    ertodd      -0.74
    perature    -0.72
    Positive Logits
     inability        0.96
     impossibility    0.94
     contradiction    0.90
     culprit          0.88
     absence          0.86
     lack             0.86
     failings         0.83
     contradictions   0.83
     perpetrator      0.81
     innocence        0.80
    Activation Density: 0.126%

    No Known Activations