INDEX
    Explanations

    Repetitive/Nonsensical Text

    The neuron strongly activates on the word “hostility.”

    Negative Logits
    Knight          -0.07
     Knight         -0.07
    _MANY           -0.06
    -phone          -0.06
     propName       -0.06
    _pipeline       -0.06
     ры             -0.06
    plist           -0.06
    .tf             -0.06
     metrics        -0.06
    Positive Logits
    (Py             0.07
                    0.06
    ควร             0.06
     شمالی          0.06
     respectfully   0.06
    quoise          0.06
    .wp             0.06
    .ForeColor      0.06
     Gameplay       0.06
     Hercules       0.06
    Act Density 0.013%

    No Known Activations