INDEX
    Explanations

    instances of the word "false"

    phrases related to false claims or deception

    Negative Logits
    "hens"                  -1.00
    "hetti"                 -0.94
    "ODY"                   -0.84
    "imen"                  -0.82
    "scene"                 -0.80
    "mun"                   -0.79
    " guiActiveUnfocused"   -0.79
    "APTER"                 -0.76
    "arya"                  -0.74
    "ILE"                   -0.73
    Positive Logits
    " accuser"              0.94
    " false"                0.93
    " guiActiveUn"          0.92
    " unfocusedRange"       0.88
    " positives"            0.87
    " misrepresent"         0.84
    " dich"                 0.83
    " falsely"              0.80
    " guiIcon"              0.78
    "false"                 0.77
    Act Density 0.014%

    No Known Activations