INDEX
    NEGATIVE LOGITS
     najle
    -0.07
     is
    -0.07
    —is
    -0.07
     was
    -0.07
    	ms
    -0.06
    _ij
    -0.06
    38
    -0.06
     beers
    -0.06
     mq
    -0.06
     ppl
    -0.06
    POSITIVE LOGITS
     have
    0.08
     Having
    0.08
     Have
    0.08
    Have
    0.07
    Experience
    0.07
    GRAPH
    0.07
     Гар
    0.07
    leur
    0.07
    having
    0.07
    Having
    0.07
    Act Density 0.060%

    No Known Activations