INDEX
Explanations
Self-reference and irony.
Philosophical statements that present self-referential contradictions or paradoxes.
This neuron responds to mentions of sentences or statements that refer to their own truth value or that are paradoxically self-referential.
Negative Logits
Token       Logit
fem         -0.07
hrd         -0.06
Rousse      -0.06
-water      -0.06
eptal       -0.06
cheaper     -0.06
watts       -0.06
↵           -0.06
्वत         -0.06
.weixin     -0.06
Positive Logits
Token         Logit
.getMethod    0.07
đỏ            0.06
-depth        0.06
Rect          0.06
erotica       0.06
นวย           0.06
ullo          0.06
.coordinates  0.06
avax          0.06
Carolyn       0.06
Activations Density: 0.199%