Explanations
news articles on a specific event or topic
Neuron Alignment
Index    Value    % of L₁
1445     +0.11    0.3%
1150     +0.10    0.3%
1097     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1997     +0.11       0.07
613      +0.10       0.06
357      +0.10       0.06
Negative Logits
ù           -0.79
vola        -0.70
assoluto    -0.69
herbes      -0.67
boxe        -0.67
impegno     -0.66
pira        -0.66
frans       -0.66
meras       -0.66
calo        -0.66
Positive Logits
unspeak       0.86
disagre       0.86
shenan        0.85
apprehen      0.83
ineffec       0.82
intersper     0.82
reluct        0.79
reported      0.77
reports       0.77
vainly        0.75
Activations Density: 0.215%