Explanations
untreated
This neuron fires specifically on the word “untreated”, marking mentions of untreated conditions.
Negative Logits
Token        Logit
Animator     -0.07
){↵↵         -0.07
голов        -0.06
(listener    -0.06
.onclick     -0.06
Log          -0.06
еріга        -0.06
[random      -0.06
۳۶           -0.06
самое        -0.06

Positive Logits
Token        Logit
untreated    0.07
rence        0.06
athletic     0.06
_shell       0.06
Offices      0.06
ий           0.06
osterone     0.06
pylab        0.06
_Server      0.06
behaved      0.06
Activations Density: 0.009%