Explanations
recommendations or suggestions
Neuron Alignment
Index   Value   % of L₁
605     +0.13   0.5%
1068    +0.12   0.4%
1425    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1425    +0.13      0.03
559     +0.12      0.03
1276    +0.11      0.02
Negative Logits
Geld       -0.47
Sitten     -0.47
tiens      -0.46
prends     -0.45
обходи     -0.44
pourrais   -0.43
Prezzo     -0.42
ähkö       -0.42
Sess       -0.42
DAR        -0.42
Positive Logits
recommendation    1.00
Recommendation    0.97
recommendations   0.96
recommending      0.96
Recommend         0.92
recommend         0.91
Recommendations   0.90
recommended       0.87
Recommended       0.87
recommends        0.86
Activations Density: 0.078%