INDEX
    Head Attr Weights
    0:0.09
    1:0.08
    2:0.06
    3:0.08
    4:0.07
    5:0.07
    6:0.08
    7:0.09
    8:0.09
    9:0.08
    10:0.09
    11:0.07
    Negative Logits
    " Chan"               -2.16
    "clips"               -2.09
    "natureconservancy"   -2.05
    " cx"                 -1.95
    " Parkinson"          -1.95
    " Emin"               -1.94
    " Chess"              -1.93
    " Lady"               -1.93
    " Inquis"             -1.93
    " Codex"              -1.92
    Positive Logits
    "sembly"              2.21
    ":]"                  2.09
    "ught"                2.08
    "NRS"                 2.02
    "aler"                1.92
    " spit"               1.92
    "upid"                1.88
    "ensical"             1.84
    " setup"              1.84
    "raft"                1.83
    Act Density 0.000%

    No Known Activations