INDEX
Explanations
phrases that signal transitions or shifts in conversation
Neuron Alignment
Index   Value   % of L₁
1757    +0.15   0.5%
1438    +0.14   0.5%
1472    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1757    +0.15      0.03
971     +0.14      0.03
1473    +0.11      0.02
Negative Logits
Obras          -0.57
Arqu           -0.56
Composición    -0.56
Fech           -0.56
bró            -0.55
jint           -0.54
Fis            -0.54
Propiedades    -0.54
醐             -0.54
Programa       -0.53
Positive Logits
impra          1.48
maneu          1.47
tolerably      1.41
gaily          1.41
indestru       1.39
unce           1.39
disreg         1.33
unspeak        1.32
unve           1.32
madonna        1.31
Activations Density 0.072%