Explanations
phrases related to activities, especially sports and performances
Neuron Alignment
Index   Value   % of L₁
50      +0.08   0.3%
783     +0.06   0.2%
1074    +0.05   0.2%
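The "% of L₁" column is not defined on this page. One plausible reading, assumed here, is that it is each neuron's absolute weight divided by the L₁ norm (sum of absolute values) of the whole feature direction. The sketch below illustrates that reading on a toy vector; the function name `top_alignment` and the toy numbers are hypothetical and are not taken from the dashboard.

```python
import numpy as np

def top_alignment(direction, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumption: "% of L1" means |value| / sum(|direction|).
    """
    direction = np.asarray(direction, dtype=float)
    l1 = np.abs(direction).sum()
    if l1 == 0:
        raise ValueError("direction has zero L1 norm")
    # Sort indices by descending absolute value and keep the first k.
    order = np.argsort(-np.abs(direction))[:k]
    return [(int(i), float(direction[i]), float(abs(direction[i]) / l1))
            for i in order]

# Toy direction with L1 norm 1.0 (hypothetical values).
toy = np.array([0.5, -0.25, 0.125, 0.125])
rows = top_alignment(toy, k=2)
for index, value, share in rows:
    print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Under this reading, a value of +0.08 reported as 0.3% of L₁ would imply an L₁ norm of roughly 27 for the full direction. That figure is only an estimate from rounded table entries.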
Correlated Neurons
Index   P. Corr.   Cos Sim.
783     +0.08      0.04
371     +0.06      0.04
1675    +0.05      0.04
Negative Logits
Token     Logit
<bos>     -1.26
public    -0.82
ⓧ         -0.77
void      -0.77
if        -0.77
const     -0.76
///**     -0.76
//{       -0.75
private   -0.75
return    -0.75
Positive Logits
Token      Logit
affor      2.77
maneu      2.72
increa     2.63
accla      2.50
impra      2.44
excru      2.41
guarante   2.40
reluct     2.40
fuf        2.40
wherea     2.38
Activations Density 0.179%