INDEX
    Explanations

    descriptions of the size, weight, and construction materials of objects

    phrases that refer to objects or devices

    Negative Logits
    Token        Logit
    "911"        -0.66
    "dding"      -0.65
    "course"     -0.64
    " Corpus"    -0.64
    "traumatic"  -0.63
    "castle"     -0.62
    " Guant"     -0.62
    "priv"       -0.62
    " Union"     -0.61
    "Priv"       -0.60
    Positive Logits
    Token        Logit
    "unes"       1.03
    "chy"        1.01
    " seems"     1.00
    "alian"      1.00
    "'ll"        0.99
    "self"       0.98
    "theless"    0.96
    " doesnt"    0.90
    "'s"         0.89
    " appears"   0.88
    Activation Density 0.282%

    No Known Activations