INDEX
    Explanations

    trained models

    This neuron specifically detects the word “trained,” especially when it appears as part of “pre-trained.”

    phrases describing the characteristics and functionalities of generative language models.

    New Auto-Interp
    Negative Logits
    );}
    -0.06
     ayar
    -0.06
     McCoy
    -0.06
     invade
    -0.06
    _OVERRIDE
    -0.06
    okt
    -0.06
     pravděpodob
    -0.06
     #+#
    -0.06
    ringe
    -0.06
    -feature
    -0.06
    POSITIVE LOGITS
    387
    0.06
    DBC
    0.06
    org
    0.06
    ikip
    0.06
    _Tag
    0.06
    	val
    0.06
    0.06
    θρώ
    0.06
    _len
    0.06
    Toronto
    0.06
    Act Density 0.001%

    No Known Activations