INDEX
    Explanations

    the word "bag" with a high activation level

    Negative Logits
    Token           Logit
    "vironment"     -0.72
    "hower"         -0.67
    "terday"        -0.66
    " Galile"       -0.65
    "issance"       -0.65
    " relations"    -0.64
    " spect"        -0.62
    " preschool"    -0.61
    "nesota"        -0.60
    " spectrum"     -0.60
    Positive Logits
    Token           Logit
    "ging"          1.23
    "bag"           1.19
    "gie"           1.15
    "gery"          1.13
    "ged"           1.12
    "bags"          1.10
    "pipe"          1.05
    "glers"         0.99
    "rill"          0.96
    "gers"          0.96
    Act Density 0.016%

    No Known Activations