Explanations
references to different approaches to solving problems or addressing issues
Neuron Alignment
Index    Value    % of L₁
50       +0.17    1.0%
1407     +0.14    0.8%
397      +0.11    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1407     +0.17       0.03
370      +0.14       0.03
397      +0.11       0.03
Negative Logits
<bos>        -3.33
ⓧ            -0.89
/**          -0.86
/***         -0.81
/*           -0.80
<?           -0.77
protected    -0.75
///**        -0.74
declare      -0.71
public       -0.68
Positive Logits
madonna       1.60
maroc         1.58
stockholm     1.55
casio         1.54
affor         1.53
tupperware    1.50
scrat         1.50
jurassic      1.50
strick        1.49
snoopy        1.49
Activations Density 0.070%