INDEX
Explanations
questions or interrogative statements
Neuron Alignment
Index    Value    % of L₁
874      +0.11    0.4%
201      +0.10    0.3%
537      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
201      +0.11       0.03
1601     +0.10       0.04
332      +0.10       0.04
Negative Logits
Token              Logit
nicolas            -0.62
pixabay            -0.58
migli              -0.55
shutterstock       -0.55
(no token text)    -0.54
darah              -0.54
aiuta              -0.54
recensioni         -0.53
apparti            -0.52
avviene            -0.51
Positive Logits
Token        Logit
question     1.24
questions    1.20
Question     1.15
question     1.09
Questions    1.04
Question     1.04
questions    1.03
QUESTIONS    0.96
QUESTION     0.95
Questions    0.93
Activations Density: 0.100%