Explanations
information related to translations
Neuron Alignment
Index   Value   % of L₁
50      +0.08   0.3%
32      +0.07   0.3%
1376    +0.07   0.2%
Correlated Neurons
Index   Pearson Corr.   Cos Sim.
1689    +0.08           0.04
360     +0.07           0.03
644     +0.07           0.02
Negative Logits
<bos>     -0.93
ⓧ         -0.80
(blank)   -0.73
#         -0.69
/*        -0.66
/**       -0.65
//        -0.57
public    -0.57
void      -0.56
<?        -0.56
Positive Logits
translation    2.10
Translation    1.95
translations   1.92
translated     1.89
translating    1.85
Translation    1.78
Translations   1.78
aen            1.77
Translated     1.75
translate      1.74
Activations Density 0.080%