    Negative Logits
    Token             Logit
    "bye"             -0.68
    "Words"           -0.64
    " Cuomo"          -0.63
    "History"         -0.61
    " symmetry"       -0.60
    "Finish"          -0.60
    "obbies"          -0.60
    "enforcement"     -0.59
    "period"          -0.59
    "sen"             -0.59
    Positive Logits
    Token             Logit
    "acle"            1.22
    "pole"            1.22
    " tent"           1.18
    "atively"         1.15
    " encamp"         1.05
    " tents"          1.03
    " pole"           0.98
    "acles"           0.95
    " poles"          0.87
    " flap"           0.86
    Activation Density: 0.009%

    No Known Activations