    Explanations

    journal publications

    The neuron activates on tokens typical of scholarly citations—phrases introducing or referencing scientific studies (e.g. “According to a study published in the Journal of…”).

    Negative Logits
    "-Bar"      -0.07
    "rin"       -0.07
    "cef"       -0.07
    "ø"         -0.07
    "ufig"      -0.07
    "clone"     -0.06
    "apo"       -0.06
    " Hunt"     -0.06
    "àng"       -0.06
    "enko"      -0.06
    Positive Logits
    " favorite"      0.07
                     0.07
                     0.06
    " Optionally"    0.06
    "최신"            0.06
    " zih"           0.06
    " Million"       0.06
    " acompan"       0.06
    " clases"        0.06
    " MVC"           0.06
    Act Density 0.010%

    No Known Activations