INDEX
    Negative Logits
    " Snapshot"      -0.09
    " Burlington"    -0.08
    "EPT"            -0.08
    " differenti"    -0.08
    " snapshot"      -0.08
    "->____"         -0.08
    " Cot"           -0.08
    "aneous"         -0.08
    "ystate"         -0.07
    ".snapshot"      -0.07
    Positive Logits
    "-shaped"        0.09
    (missing)        0.08
    "beer"           0.08
    " metaph"        0.08
    " Seems"         0.08
    " Beethoven"     0.07
    " sands"         0.07
    (missing)        0.07
    " tiers"         0.07
    " schlafen"      0.07
    Act Density 0.003%

    No Known Activations