INDEX
    Explanations

    phrases indicating special emphasis or importance

    emphasis on specific topics or attributes

    NEGATIVE LOGITS
    Token            Logit
    "ylon"           -0.68
    "ences"          -0.66
    " essentially"   -0.65
    "CT"             -0.64
    "ensibly"        -0.63
    "ruary"          -0.63
    " substitutes"   -0.63
    "only"           -0.62
    " Anarchy"       -0.62
    "offic"          -0.61

    POSITIVE LOGITS
    Token            Logit
    " egregious"      1.09
    " noteworthy"     1.04
    " suited"         0.93
    " notable"        0.90
    " troublesome"    0.90
    " noticeable"     0.88
    " susceptible"    0.85
    " poignant"       0.84
    " advantageous"   0.82
    " vulnerable"     0.82
    Activation Density: 0.050%

    No Known Activations