INDEX
    Explanations

    This neuron detects words expressing apology (e.g., “apologize,” “sorry,” “apologizing”).

    Negative Logits
    .httpClient
    -0.06
    _CFG
    -0.06
    자동
    -0.06
    اطعة
    -0.06
    ЛЬ
    -0.06
     userInput
    -0.06
    NullOrEmpty
    -0.06
     GNOME
    -0.06
    $password
    -0.06
    _duration
    -0.06
    Positive Logits
     colspan
    0.07
    pector
    0.06
     возникнов
    0.06
    ोष
    0.06
    |
    0.06
     Thành
    0.06
     той
    0.06
     Period
    0.06
    .Directory
    0.06
    NH
    0.06
    Activation Density 0.013%

    No Known Activations