INDEX
    Explanations

    This neuron detects mentions of copying or plagiarism, that is, words indicating stolen ideas or intellectual-property infringement.

    Negative Logits
     foot        -0.07
     있었        -0.06
    570          -0.06
    .boost       -0.06
    970          -0.06
     spray       -0.06
     robots      -0.06
    _df          -0.06
     Davies      -0.06
    Adobe        -0.06
    Positive Logits
    paginate     0.07
     doubly      0.07
     اطل         0.06
     dern        0.06
    >J           0.06
     folly       0.06
    Editable     0.06
    [token not shown]  0.06
     особи       0.06
    'elle        0.06
    Act Density 0.072%
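
The Act Density figure above can be read as the share of token positions on which the neuron fires. The page does not state how the number is computed, so the sketch below is an assumption: it treats density as the fraction of positions whose activation exceeds a threshold of zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means active positions divided by all positions.
    The dashboard does not document its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 7 of which are active.
sample = [0.0] * 9993 + [1.5] * 7
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.072% would mean the neuron is active on roughly 7 out of every 10,000 tokens.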

    No Known Activations