    NEGATIVE LOGITS
    \Traits             -0.07
     Stranger           -0.06
     windshield         -0.06
    .story              -0.06
     Predicate          -0.06
    Versions            -0.06
     unconscious        -0.06
                        -0.06
     oscill             -0.06
     UPDATED            -0.06
    POSITIVE LOGITS
    *I                   0.07
    .are                 0.07
    _fore                0.07
     entering            0.06
    fare                 0.06
     '['                 0.06
     supplementation     0.06
     क                   0.06
     kter                0.06
    かり                 0.06
    Activation Density: 0.007%

    No Known Activations