INDEX
    Explanations

    instances of the word "ols" with varying activation levels

    mentions of "pistols" in various contexts

    Negative Logits
    "ŃĶ"          -0.72
    " Rapp"       -0.70
    "office"      -0.66
    "liest"       -0.62
    "FACE"        -0.58
    " Pentagon"   -0.58
    " French"     -0.57
    "Chair"       -0.57
    " Kenyan"     -0.57
    "PRESS"       -0.56
    Positive Logits
    "ols"         1.39
    "olics"       1.04
    "ength"       1.00
    "terday"      0.99
    "ongs"        0.91
    "olic"        0.90
    "atile"       0.90
    "ands"        0.89
    "ipop"        0.88
    "allery"      0.85
    Activation Density: 0.010%

    No Known Activations