INDEX
    Explanations

    This neuron fires on occurrences of “arxiv” (i.e. references to arXiv preprints or arxiv.org links).

    Negative Logits
    Token             Logit
    (not captured)    -0.07
    " FullName"       -0.07
    " Fon"            -0.06
    " bổ"             -0.06
    "MATRIX"          -0.06
    " SCALE"          -0.06
    "аза"             -0.06
    " Fuß"            -0.06
    "_deep"           -0.06
    "ubat"            -0.06
    Positive Logits
    Token             Logit
    "history"         0.08
    ".history"        0.07
    " refrigerator"   0.07
    " Pixar"          0.07
    (not captured)    0.07
    " allowNull"      0.06
    " 직접"           0.06
    " Chef"           0.06
    "iv"              0.06
    " ide"            0.06
    Act Density 0.001%

    No Known Activations