Explanations
conversations
conversational dynamics involving taboo topics or fantasies.
This neuron detects tokens that occur in quoted or spoken dialogue (i.e. inside quotation marks or speaker turns).
Negative Logits
redesign        -0.07
Millennium      -0.06
очных           -0.06
Offensive       -0.06
Stars           -0.06
409             -0.06
ketogenic       -0.06
Sands           -0.06
formatter       -0.06
.getProperty    -0.06
Positive Logits
LO              0.07
PP              0.07
AN              0.07
dahi            0.06
eater           0.06
_TIMES          0.06
_pel            0.06
imedia          0.06
GNUNET          0.06
procrast        0.06
Activations Density 0.111%
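
The page does not define how the density figure is computed. As a rough sketch, assuming it is the fraction of sampled tokens on which the neuron's activation is above zero, the calculation could look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce a value of 0.111%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires;
    the source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the neuron fires on 111 of 100,000 tokens.
sample = [0.0] * 99_889 + [1.5] * 111
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density this low means the neuron is silent on almost all tokens, which fits a neuron described as responding to a narrow context such as quoted dialogue.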