INDEX
Explanations
descriptions and characteristics of individuals in a news article
Neuron Alignment
Index   Value   % of L₁
227     +0.11   0.3%
856     +0.10   0.3%
453     +0.10   0.3%
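The "% of L₁" column is not defined on this page. One plausible reading, offered here only as an assumption, is that it reports the magnitude of a single neuron's weight as a share of the L₁ norm (sum of absolute values) of the whole weight vector. The sketch below illustrates that reading; `l1_share` is a hypothetical helper name and the weights in the usage line are toy values, not data from this page.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by the
    sum of absolute values of all entries. The page does not confirm this.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("the zero vector has no L1 norm to normalize by")
    return abs(weights[index]) / l1_norm


# Toy usage: the first entry carries a quarter of the total L1 mass.
share = l1_share([1.0, -3.0], 0)
print(f"{share:.1%}")
```

Under this reading, a value of +0.11 at 0.3% would imply that the weight is spread thinly across many neurons rather than concentrated in a few.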
Correlated Neurons
Index   P. Corr.   Cos Sim.
648     +0.11      0.04
227     +0.10      0.05
743     +0.10      0.03
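For readers unfamiliar with the two columns above, the following is a minimal sketch of how such numbers could be computed, assuming "P. Corr." denotes the Pearson correlation between the feature's activations and a neuron's activations over the same tokens, and "Cos Sim." denotes the cosine similarity between two weight vectors. Both function names are illustrative, and the page does not state which vectors are actually compared.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of activations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

The two measures answer different questions: correlation compares behaviour over data, while cosine similarity compares directions in weight space, so a neuron can rank highly on one and not the other.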
Negative Logits
lele       -1.54
bordeaux   -1.51
soggior    -1.49
milano     -1.48
nutr       -1.46
ivi        -1.44
doman      -1.44
tanga      -1.40
mef        -1.38
canel      -1.37
Positive Logits
likewise    0.98
similarly   0.95
meanwhile   0.88
also        0.88
agrees      0.85
echoed      0.81
agree       0.81
agreed      0.78
тоже        0.78
disagreed   0.75
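Logit lists like the two above are commonly produced by projecting a feature's output direction through the model's unembedding matrix and ranking the vocabulary by the resulting score. Whether this page uses exactly that procedure is an assumption. The sketch below shows the idea on toy data; `logit_effects`, `top_and_bottom`, and all values in the usage lines are hypothetical.

```python
def logit_effects(direction, unembed, vocab):
    """Score each token by the dot product of its unembedding row with `direction`.

    Assumption: the dashboard's logits are such projections. `unembed` holds
    one row per token in `vocab`, each row as long as `direction`.
    """
    return {
        token: sum(w * d for w, d in zip(row, direction))
        for token, row in zip(vocab, unembed)
    }


def top_and_bottom(scores, k):
    """Return the k most positive and k most negative (token, score) pairs."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    most_positive = ranked[:k]
    most_negative = ranked[-k:][::-1]  # most negative first
    return most_positive, most_negative


# Toy usage with a 2-dimensional direction and a 3-token vocabulary.
scores = logit_effects([1.0, 0.0], [[2.0, 0.0], [-1.0, 5.0], [0.5, 1.0]], ["a", "b", "c"])
positive, negative = top_and_bottom(scores, 1)
```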
Activation Density: 0.262%
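Activation density is usually read as the fraction of sampled tokens on which the feature fires at all. A minimal sketch under that assumption follows; the function name, the zero threshold, and the sample values are illustrative and not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: a token counts as "active" when the feature's activation
    is greater than zero; the page does not state its threshold.
    """
    if not activations:
        raise ValueError("need at least one activation to compute a density")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy usage: one active token out of four.
density = activation_density([0.0, 0.0, 1.5, 0.0])
print(f"{density:.3%}")
```

On this reading, a density of 0.262% would mean the feature is active on roughly one token in every 380.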