INDEX
    Explanations

    Nationalities and locations

    This neuron detects words referring to nationalities, place-based demonyms, or place names (e.g., Milanese, Pacific, Ocean).

    Negative Logits
    $class        -0.07
     참가          -0.07
    制造           -0.07
     saf          -0.07
    机械           -0.06
     naveg        -0.06
                  -0.06
    ثال           -0.06
     soo          -0.06
     Kore         -0.06

    Positive Logits
     GNOME        0.07
    ัพท           0.07
    #SBATCH       0.06
    (sound        0.06
    InThe         0.06
    (bucket       0.06
    _##           0.06
     repos        0.06
    (di           0.05
    nte           0.05

    Activation Density: 0.278%

    No Known Activations