INDEX
    Explanations

    Insults and criticism

    Negative Logits
    Token              Logit
    " Ireland"         -0.07
    " Panic"           -0.06
    " flood"           -0.06
    " bases"           -0.06
    "-exc"             -0.06
    " quý"             -0.06
    " widen"           -0.06
    " Constitution"    -0.06
    " Era"             -0.06
    "_NUMERIC"         -0.06
    Positive Logits
    Token              Logit
    "after"            0.07
    "ाख"               0.06
    "Reference"        0.06
    "ackage"           0.06
    " tokenizer"       0.06
    "someone"          0.06
    "_ELEM"            0.06
    (blank)            0.06
    "about"            0.06
    (blank)            0.06
    Activation Density: 0.007%

    No Known Activations