    Explanations

    phrases related to completion or accomplishment

    instances of the word "out."

    Negative Logits
    " arsen"       -0.82
    " tyr"         -0.66
    " grooming"    -0.64
    " resil"       -0.62
    "avery"        -0.59
    " turnover"    -0.59
    " oxid"        -0.59
    " Pry"         -0.58
    "avorite"      -0.57
    " metallic"    -0.57

    Positive Logits
    "doors"         1.06
    "fitted"        1.01
    "lier"          0.97
    "door"          0.97
    "stretched"     0.96
    "casts"         0.95
    "skirts"        0.90
    "dated"         0.88
    "flow"          0.88
    "fits"          0.87

    Act Density 0.016%

    No Known Activations