INDEX
    Explanations

    superlatives and evaluations such as "most," "difficult," "favorite," "best," "worst," "consistent," "irritating," "popular," "powerful," and "prominent" in a text

    Negative Logits
    "skirts"    -0.78
    "kas"       -0.71
    "thur"      -0.71
    "rompt"     -0.65
    "pload"     -0.63
    "heid"      -0.63
    "selves"    -0.63
    "blocks"    -0.63
    "roth"      -0.63
    "too"       -0.63
    Positive Logits
    " imaginable"     0.99
    " conceivable"    0.93
    " possible"       0.86
    " practicable"    0.79
    "seller"          0.79
    " ever"           0.77
    " Wanted"         0.75
    "liest"           0.71
    " notable"        0.71
    "ream"            0.70
    Act Density 2.122%

    No Known Activations