    Explanations

    This neuron fires on the French definite article “La.”

    Negative Logits
    Token           Logit
    "explain"       -0.08
    " Of"           -0.08
    " thù"          -0.07
    "tower"         -0.07
    "_RF"           -0.07
    " of"           -0.06
    "itize"         -0.06
    "ισμ"           -0.06
    " tavsiye"      -0.06
    " devs"         -0.06
    Positive Logits
    Token           Logit
    " La"           0.10
    " El"           0.10
    "Les"           0.09
    "La"            0.08
    "FormGroup"     0.08
    "El"            0.08
    " Les"          0.08
    " Las"          0.07
    " Le"           0.07
    "\tGet"         0.07
    Activation Density: 0.040%

    No Known Activations