    Explanations

    phrases or words indicating a false justification or reason for a particular action or belief

    terms related to justifications or excuses for actions

    Negative Logits
    Token         Logit
    elong         -0.74
    Life          -0.70
    itivity       -0.70
    Die           -0.69
     evolve       -0.68
    igr           -0.63
    AMI           -0.63
    average       -0.62
    Surv          -0.62
    life          -0.62
    Positive Logits
    Token         Logit
     pretext      3.84
     guise        2.09
     spurious     1.38
     bogus        1.36
     provocation  1.29
     disguise     1.24
     phony        1.24
     dubious      1.22
     pretended    1.18
     euphem       1.18
    Activation Density: 0.027%

    No Known Activations