INDEX
Explanations
instances of the word "endeavor" in a context where new ventures, research, or projects are being discussed
Neuron Alignment
Index    Value    % of L₁
50       +0.07    0.2%
505      +0.06    0.2%
1797     +0.06    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
630      +0.07       0.02
710      +0.06       0.02
191      +0.06       0.02
Negative Logits
<bos>           -0.77
protected       -0.73
itemize         -0.72
displayquote    -0.71
public          -0.71
"..\..\..\      -0.70
//}             -0.70
static          -0.69
frac            -0.68
cout            -0.67
Positive Logits
maneu       2.21
shenan      2.17
milf        2.13
🤣🤣        2.10
affor       2.10
increa      2.09
attemp      2.08
scrat       2.06
strick      2.02
madonna     2.01
Activations Density 0.056%