INDEX
    Explanations

    instances of words related to emphasis and strong assertions in discussions

    Negative Logits
    Token          Logit
    "PT"           -0.16
    "adam"         -0.15
    "isas"         -0.15
    "verse"        -0.15
    "VERSE"        -0.15
    "ish"          -0.15
    "hood"         -0.15
    "chet"         -0.14
    "isle"         -0.14
    "von"          -0.14

    Positive Logits
    Token          Logit
    "ington"        0.17
    " point"        0.16
    "/max"          0.15
    "unden"         0.15
    "nox"           0.15
    "chrift"        0.15
    "erus"          0.15
    "rina"          0.15
    " points"       0.15
    "ingleton"      0.14

    Act Density 0.039%

    No Known Activations