INDEX
Explanations
phrases related to political figures or names with repeated sequences of letters like "oo", "ee", "eth"
Neuron Alignment
Index    Value    % of L₁
1276     +0.15    0.6%
1872     +0.14    0.6%
421      +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1276     +0.15       0.04
1872     +0.14       0.03
421      +0.13       0.03
Negative Logits
Pří                -0.48
Personendaten      -0.45
كومونز             -0.44
المعيارى           -0.42
Punt               -0.41
FieldDescriptor    -0.41
eleste             -0.41
tagHelperRunner    -0.41
руйте              -0.40
Зміст              -0.40
Positive Logits
Ro        1.09
Ro        1.02
ro        1.01
RO        0.95
Rovers    0.86
ros       0.85
ROS       0.85
roti      0.79
roam      0.78
Roach     0.77
Activations Density: 0.151%