INDEX
    Explanations

    references to helmets

    Negative Logits
        Token           Logit
        "ween"          -0.69
        "tery"          -0.68
        "VD"            -0.68
        "quart"         -0.68
        "atoes"         -0.67
        "uality"        -0.67
        "agents"        -0.67
        " Roosevelt"    -0.66
        "hower"         -0.66
        "aceae"         -0.66

    Positive Logits
        Token           Logit
        " helmets"      1.08
        " helmet"       1.04
        " worn"         1.02
        " goggles"      0.88
        " wearer"       0.87
        " Helmet"       0.84
        " equipped"     0.80
        " adorned"      0.79
        " mask"         0.76
        " masks"        0.75
    Activation Density: 0.024%

    No Known Activations