    NEGATIVE LOGITS
    Token           Logit
    "Life"          -0.08
    " Life"         -0.07
    "465"           -0.06
    "figcaption"    -0.06
                    -0.06
    "445"           -0.06
    "756"           -0.06
    "Stuff"         -0.06
    " folk"         -0.06
    "bins"          -0.06
    POSITIVE LOGITS
    Token           Logit
    " anterior"     0.13
    " posterior"    0.09
    "erior"         0.08
    "ner"           0.07
    " AR"           0.07
    " posters"      0.07
    " antis"        0.07
    " ant"          0.07
    "NotExist"      0.07
    " antenna"      0.07
    Act Density 0.004%

    No Known Activations