Explanations
names of a specific tennis player, "Maria Sharapova"
Neuron Alignment
Index   Value   % of L₁
1896    +0.16   1.0%
478     +0.14   0.9%
50      +0.14   0.9%
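The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it is each neuron's absolute weight divided by the L₁ norm of the feature's full weight vector (under that reading, +0.16 at 1.0% would imply an L₁ norm of roughly 16). The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, top_k=3):
    """Return (index, value, share of L1 norm) for the top_k entries by |value|.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. This is an illustrative
    reading of the column, not a confirmed definition.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Indices sorted by absolute weight, largest first (ties keep original order).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in order[:top_k]]


# Toy vector (hypothetical values, chosen so the arithmetic is exact).
toy = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(toy, top_k=2)
for index, value, share in rows:
    print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Small shares such as the 1.0% and 0.9% in the table would then indicate a feature spread across many neurons rather than aligned with any single one.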
Correlated Neurons
Index   P. Corr.   Cos Sim.
227     +0.16      0.11
478     +0.14      0.05
1616    +0.14      0.07
Negative Logits
<bos>     -2.67
habhar    -0.88
amaño     -0.80
queline   -0.72
ħħ        -0.69
ffè       -0.69
isuke     -0.67
uldron    -0.66
ándor     -0.59
lv        -0.59
Positive Logits
McLaugh      1.18
Abbé         1.16
unwarran     1.06
sovere       1.06
increa       1.05
Shakspeare   1.03
Bartholo     1.01
unce         1.00
stockholm    0.99
inev         0.99
Activations Density: 1.224%