INDEX
Explanations
mathematical operations like addition, subtraction, multiplication, and division within text
Neuron Alignment
Index   Value   % of L₁
1343    +0.24   0.7%
1699    +0.15   0.5%
2034    +0.14   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1343    +0.24      0.06
1510    +0.15      0.05
981     +0.14      0.06
Negative Logits
<bos>       -0.97
Примеча     -0.97
asteroide   -0.90
Glej        -0.88
Legături    -0.88
انظر        -0.85
Pozri       -0.82
Fácil       -0.82
Сю          -0.81
Izvori      -0.81
Positive Logits
!...    1.38
?...    1.36
fto     1.29
mef     1.29
fta     1.29
unil    1.27
fluo    1.27
squa    1.27
sii     1.27
mme     1.26
Activations Density 0.321%