INDEX
Explanations
phrases related to user experience
Neuron Alignment
Index   Value   % of L₁
1013    +0.13   0.4%
599     +0.10   0.3%
166     +0.10   0.3%
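The "% of L₁" column is not defined on this page. One plausible reading, offered here only as an assumption, is that it reports a single neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the feature's full weight vector. A minimal sketch of that arithmetic follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from this feature's data.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values of all entries in the same vector.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("weight vector has zero L1 norm")
    return abs(weights[index]) / l1_norm


# Toy vector for illustration only (not data from this dashboard).
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```

Under this reading, shares as small as 0.3% to 0.4% for the top-ranked neurons would suggest that the feature's weight is spread across many neurons rather than concentrated in a few.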
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.13      0.04
662     +0.10      0.04
1013    +0.10      0.04
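"P. Corr." and "Cos Sim." presumably abbreviate Pearson correlation and cosine similarity. Which vectors the dashboard compares for each column is not stated here, so the sketch below only illustrates the two standard definitions on toy data. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Undefined (division by zero) if either vector is all zeros.
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    # Undefined if either input is constant (zero variance).
    return cosine_similarity(centered_a, centered_b)


# Toy data for illustration only (not data from this dashboard).
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
print(pearson_correlation(a, b))  # perfectly anti-correlated
print(cosine_similarity(a, b))    # still positive: all entries > 0
```

The toy pair shows why the two columns can disagree: mean-centering changes the answer whenever the vectors have nonzero means.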
Negative Logits
Juf       -1.05
viciss    -0.90
javier    -0.87
Khart     -0.85
Aprile    -0.83
alberto   -0.82
emphat    -0.79
jorge     -0.79
Augu      -0.79
ftu       -0.75
Positive Logits
experience    0.96
Experience    0.85
experiences   0.81
EXPERIENCE    0.77
experience    0.76
Experience    0.74
lebnis        0.72
Experiences   0.69
体验          0.67
perience      0.67
Activations Density 0.353%