INDEX
Explanations
contact information and instructions in an online message
Neuron Alignment
Index   Value   % of L₁
1328    +0.12   0.4%
1671    +0.12   0.4%
1127    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1127    +0.12      0.05
1865    +0.12      0.04
1806    +0.10      0.05
Negative Logits
Violon       -0.89
Meille       -0.84
hunde        -0.80
peculi       -0.79
marte        -0.77
optik        -0.77
poliester    -0.76
parati       -0.75
meras        -0.75
Tenis        -0.74
Positive Logits
alternatively    0.90
equivalently     0.77
alternately      0.68
bidden           0.67
conversely       0.66
else             0.66
optionally       0.65
somethin         0.61
choose           0.58
<?               0.57
Activations Density 0.188%