Explanations
instances where the phrase "I can" is followed by a verb
Neuron Alignment
Index   Value   % of L₁
50      +0.25   1.3%
1124    +0.11   0.6%
805     +0.09   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1124    +0.25      0.08
1974    +0.11      0.08
1415    +0.09      0.07
Negative Logits
Token         Logit
<bos>         -3.17
/***          -0.80
ⓧ             -0.69
<tfoot>       -0.68
<?            -0.66
(blank)       -0.61
//};          -0.59
EndProject    -0.58
/**           -0.58
})();         -0.57
Positive Logits
Token         Logit
affor         1.17
accla         1.13
practition    1.12
saar          1.08
fatis         1.08
ecru          1.07
impra         1.06
maneu         1.05
maroc         1.03
stockholm     1.03
Activations Density: 0.457%