INDEX
Explanations
verbs related to taking decisive or forceful action
Neuron Alignment
Index   Value   % of L₁
683     +0.11   0.3%
1526    +0.10   0.3%
1839    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
683     +0.11      0.06
678     +0.10      0.05
1056    +0.09      0.05
Negative Logits
Rine        -0.76
Timp        -0.72
intersper   -0.71
Amé         -0.69
pymysql     -0.69
Incenti     -0.69
Daven       -0.67
McInt       -0.67
Mlle        -0.66
Stefanie    -0.66
Positive Logits
ing     2.15
ING     1.59
ting    1.01
uing    0.93
ging    0.92
ingan   0.91
ingi    0.89
ning    0.89
ings    0.87
ed      0.85
Activations Density 0.400%