    Explanations

    self-reference and irony

    philosophical statements that present self-referential contradictions or paradoxes.

    This neuron responds to mentions of sentences or statements that refer to their own truthfulness or paradoxical self-reference.

    Negative Logits
     fem
    -0.07
     hrd
    -0.06
     Rousse
    -0.06
    -water
    -0.06
    eptal
    -0.06
     cheaper
    -0.06
     watts
    -0.06
    -0.06
    ्वत
    -0.06
    .weixin
    -0.06
    Positive Logits
    .getMethod
    0.07
     đỏ
    0.06
    -depth
    0.06
     Rect
    0.06
     erotica
    0.06
    นวย
    0.06
    ullo
    0.06
    .coordinates
    0.06
    avax
    0.06
     Carolyn
    0.06
    Activation Density 0.199%

    No Known Activations