    Explanations

    dialogue snippets

    The neuron detects polite forms of address and apology phrases (e.g. “Madam,” “I’m sorry”).

    Negative Logits
    Token            Logit
    (whitespace)     -0.07
    " pow"           -0.07
    ".inverse"       -0.07
    "find"           -0.07
    "ний"            -0.06
    "iera"           -0.06
    " jew"           -0.06
    "ween"           -0.06
    " например"      -0.06
    "culated"        -0.06
    Positive Logits
    Token               Logit
    "(/^\"              0.07
    "_FLUSH"            0.07
    " вступ"            0.06
    "BASH"              0.06
    " testim"           0.06
    "едак"              0.06
    "capture"           0.06
    ".DELETE"           0.06
    " uncont"           0.06
    " WaitForSeconds"   0.06
    Activation Density: 0.011%

    No Known Activations