    Explanations

    the word 'ug' at varying activation levels

    repeated mentions of the term "Guggenheim."

    Negative Logits
    Token         Logit
    "peak"        -0.70
    " Leilan"     -0.69
    " calming"    -0.65
    " compr"      -0.62
    "cape"        -0.61
    " infancy"    -0.61
    " wards"      -0.61
    " blocker"    -0.60
    " targ"       -0.60
    " Hemp"       -0.60
    Positive Logits
    Token         Logit
    "glers"       1.42
    "uese"        1.20
    "gery"        1.15
    "uay"         1.15
    "ged"         1.08
    "ging"        1.06
    "nant"        1.00
    "gers"        0.99
    "ger"         0.96
    "ats"         0.93
    Activation Density: 0.012%

    No Known Activations