Explanations
phrases containing the word "face"
Neuron Alignment
Index   Value   % of L₁
1047    +0.19   0.8%
67      +0.19   0.8%
1865    +0.14   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1047    +0.19      0.04
67      +0.19      0.04
629     +0.14      0.02
Negative Logits
Krieger   -0.42
Andrey    -0.42
Sosa      -0.41
Elise     -0.41
Burnett   -0.41
Segura    -0.41
Kelsey    -0.40
Elise     -0.40
Queen     -0.40
desen     -0.39
Positive Logits
fa          0.97
Fa          0.92
lapto       0.90
Fa          0.88
maksi       0.85
faw         0.85
traktor     0.82
Italij      0.81
fas         0.80
frankfurt   0.80
Activations Density 0.156%