Explanations
personal blog excerpts
This neuron activates on explicit mentions of a character's sexual identity or on erotic descriptors (e.g., gay, homosexual, slut).
Negative Logits
slightly          -0.07
.textAlignment    -0.07
Alpha             -0.06
swaps             -0.06
programas         -0.06
(simp             -0.06
SW                -0.06
сф                -0.06
wür               -0.06
中文字幕          -0.06
Positive Logits
olsa              0.07
sup               0.07
illum             0.07
.lastName         0.06
Shipping          0.06
erin              0.06
Except            0.06
yne               0.06
_ERRORS           0.06
iman              0.06
Activations Density: 0.023%