INDEX
Explanations
Text information or descriptions related to various characters or profiles in different scenarios
Neuron Alignment
Index   Value   % of L₁
1103    +0.13   0.5%
1839    +0.12   0.5%
1677    +0.11   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1103    +0.13      0.03
492     +0.12      0.03
1895    +0.11      0.02
Negative Logits
MessageOf         -0.47
cademic           -0.46
volup             -0.46
CascadeType       -0.46
Schuh             -0.45
ConstraintMaker   -0.45
withIdentifier    -0.45
ijuana            -0.45
TargetException   -0.44
Jahrgang          -0.44
Positive Logits
text     1.39
Text     1.28
text     1.25
texts    1.21
Text     1.20
TEXT     1.18
TEXT     1.18
texted   1.03
texte    1.02
Texts    1.01
Activations Density: 0.067%