INDEX
Explanations
mentions of hot beverages, particularly tea, and their potential health risks
Neuron Alignment
Index   Value   % of L₁
1253    +0.15   0.5%
964     +0.14   0.4%
227     +0.12   0.4%
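
The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute value as a share of the L₁ norm (sum of absolute values) of the whole vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from this feature's actual weights.

```python
def l1_share(weights):
    """Return each entry's absolute value as a percentage of the vector's L1 norm.

    Assumption: this mirrors what a "% of L1" column might report; the page
    itself does not define the column.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1 for w in weights]


# Hypothetical toy vector for illustration only.
toy = [0.15, -0.10, 0.05, 0.20]
shares = l1_share(toy)
```

Under this reading, a value of +0.15 accounting for 0.5% would imply an L₁ norm of roughly 30 for the full vector, which is broadly consistent with the other rows after rounding.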
Correlated Neurons
Index   P. Corr.   Cos Sim.
1253    +0.15      0.03
102     +0.14      0.05
939     +0.12      0.09
Negative Logits
TagMode       -0.66
twimg         -0.54
Catalana      -0.53
Paglinawan    -0.51
">—           -0.51
EIO           -0.50
condividere   -0.50
totul         -0.50
fiducia       -0.49
compone       -0.48
Positive Logits
Shakspeare    1.05
unwarran      1.04
Juf           0.96
Souha         0.94
plenti        0.93
Thos          0.93
Inhabitants   0.92
McLaugh       0.91
reluct        0.90
Shaksp        0.89
Activations Density 1.728%