INDEX
    Explanations

    sentences starting with "The truth is"

    statements or assertions that express a fact or truth

    Negative Logits
    "throp"           -0.85
    "iates"           -0.72
    " Improvement"    -0.63
    "icipated"        -0.62
    "viol"            -0.61
    "stad"            -0.61
    " derog"          -0.61
    "ypes"            -0.60
    "ivil"            -0.59
    " violation"      -0.58
    Positive Logits
    " borne"          0.78
    " neither"        0.77
    " not"            0.74
    " probably"       0.73
    " unclear"        0.72
    " none"           0.71
    " indeed"         0.70
    " undoubtedly"    0.68
    " still"          0.67
    " doubtless"      0.67
    Activation Density: 0.101%

    No Known Activations