INDEX
    Explanations

    adjectives describing intensity or extremity

    extensive use of the word "so" to express strong emphasis or degree

    Negative Logits
    Token                 Logit
    "works"               -0.70
    "theless"             -0.69
    "nings"               -0.67
    "ulia"                -0.63
    "amac"                -0.63
    " excerpts"           -0.61
    " eviction"           -0.61
    " Flavoring"          -0.60
    " coincides"          -0.60
    " prompts"            -0.59

    Positive Logits
    Token                 Logit
    "bered"                1.15
    "ooo"                  1.14
    "oooo"                 1.13
    "oths"                 1.05
    "oooooooo"             1.02
    "othes"                0.97
    "oooooooooooooooo"     0.95
    " far"                 0.89
    "othe"                 0.86
    "othing"               0.86

    Activation Density: 0.068%

    No Known Activations