    Explanations

    punctuation

    This neuron never activates; it does not respond to any token.

    NEGATIVE LOGITS
     compares
    -0.07
     ايران
    -0.07
     allocations
    -0.07
    ाच
    -0.07
    Latitude
    -0.06
     configur
    -0.06
    Resources
    -0.06
    елов
    -0.06
    .tech
    -0.06
     ")↵
    -0.06
    POSITIVE LOGITS
    екотор
    0.07
     Lebens
    0.06
    	done
    0.06
     pylab
    0.06
    -flag
    0.06
    licht
    0.06
     Dio
    0.06
    -HT
    0.06
    bel
    0.06
    ungen
    0.06
    Act Density 0.021%

    No Known Activations