INDEX
    Explanations
    Negative Logits
    Token              Logit
    " autof"           -0.08
    " Pry"             -0.08
    " fellows"         -0.07
    " Foley"           -0.07
    " constructing"    -0.07
    " pupils"          -0.07
    "-fa"              -0.07
    " paj"             -0.07
    " misc"            -0.07
    (blank)            -0.07

    Positive Logits
    Token              Logit
    (blank)             0.08
    (blank)             0.08
    "sic"               0.08
    "bread"             0.07
    "HQ"                0.07
    (blank)             0.07
    "(fr"               0.07
    " emple"            0.07
    " Abl"              0.07
    " desto"            0.07
    Activation Density: 0.044%

    No Known Activations