Explanations
culture and humor
The neuron detects informal humor and pop-culture cues—words indicating jokes, sarcasm, or cultural references.
Negative Logits
Token       Logit
Routing     -0.07
msgs        -0.06
Lights      -0.06
Signals     -0.06
streams     -0.06
[~,         -0.06
elapsed     -0.06
Narc        -0.06
Broadcast   -0.06
idiots      -0.06
Positive Logits
Token       Logit
wrestling   0.07
agram       0.06
ouri        0.06
оні         0.06
615         0.06
Ellie       0.06
沢          0.06
.Queue      0.06
itating     0.06
acie        0.05
Activations Density 0.003%
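
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number can be computed, assuming density means the fraction of token positions on which the neuron's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration; the dashboard's actual sampling and thresholding may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, of which only 3 are active.
sample = [0.0] * 99_997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.003%"
```

At a density this low, the neuron is silent on nearly every token, so any explanation of it rests on a very small set of activating examples.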