INDEX
Explanations
mentions of the word "drone"
Neuron Alignment
Index   Value   % of L₁
1757    +0.12   0.4%
1323    +0.09   0.3%
1052    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
690     +0.12      0.04
1343    +0.09      0.04
1363    +0.08      0.03
Negative Logits
public     -0.76
class      -0.76
itemize    -0.74
           -0.73
           -0.73
<bos>      -0.73
           -0.72
private    -0.72
           -0.71
for        -0.71
Positive Logits
maneu      2.38
milf       2.25
increa     2.21
impra      2.20
strick     2.20
hentai     2.17
affor      2.17
disreg     2.15
reluct     2.15
emphat     2.11
Activations Density 0.338%