    Explanations

    This neuron detects disclaimer language stating affiliation, non-affiliation, or endorsement (e.g., “not affiliated,” “endorsed by,” “authorized”).

    Negative Logits
    inary           -0.08
    metric          -0.07
    .tabs           -0.07
    =""             -0.06
    subscription    -0.06
                    -0.06
    .pipe           -0.06
     alimentos      -0.06
    其中            -0.06
    気が            -0.06
    Positive Logits
     frm            0.07
     EVT            0.07
     fen            0.07
     Ib             0.07
     systemd        0.07
    ۱۹۴             0.06
    ЛИ              0.06
    .FAIL           0.06
     qt             0.06
    _IV             0.06
    Act Density 0.003%

    No Known Activations