Explanations
reality versus fiction
This neuron activates on mentions of “human,” specifically when the text refers to the agent as a human being rather than an AI.
Negative Logits
children                 -0.07
EP                       -0.06
.setBackgroundResource   -0.06
ECB                      -0.06
@{↵                      -0.06
WaitForSeconds           -0.06
Pipe                     -0.06
moderated                -0.06
_past                    -0.06
ors                      -0.06
Positive Logits
@pytest                  0.07
*dt                      0.06
Εθν                      0.06
_firstname               0.06
Calif                    0.06
τευ                      0.06
foo                      0.06
字段                     0.06
окумент                  0.06
QIcon                    0.06
Activations Density 0.023%