    Negative Logits
    Token           Logit
    " Autumn"       -0.09
    "202"           -0.08
    "_AURA"         -0.07
    " portion"      -0.07
    " autumn"       -0.07
    " Watson"       -0.07
    " Peterson"     -0.07
    " Henderson"    -0.07
    " sunt"         -0.07
    " protr"        -0.07
    Positive Logits
    Token           Logit
    " like"         0.16
    " Like"         0.14
    "like"          0.13
    " LIKE"         0.13
    "-like"         0.11
    "Like"          0.11
    " unlike"       0.09
    " lik"          0.09
    "ike"           0.09
    "_like"         0.09
    Activation Density: 0.081%

    No Known Activations