INDEX
Explanations
text related to predictions and consequences
Neuron Alignment
Index    Value    % of L₁
344      +0.09    0.3%
623      +0.08    0.2%
231      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
231      +0.09       0.04
1460     +0.08       0.04
623      +0.08       0.03
Negative Logits
kasa       -0.76
Okt        -0.71
fers       -0.70
lapto      -0.69
lele       -0.69
traktor    -0.69
karton     -0.66
kac        -0.65
laf        -0.65
kade       -0.64
Positive Logits
predictions    1.22
predicting     1.19
prediction     1.17
predict        1.10
Predictions    1.07
predictive     1.07
predicts       1.04
Prediction     1.02
predicted      1.02
forecasting    1.01
Activations Density 0.457%