INDEX
Explanations
phrases related to adverse behavior or incidents
Neuron Alignment
Index   Value   % of L₁
453     +0.21   0.6%
1577    +0.13   0.4%
876     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
453     +0.21      0.03
1417    +0.13      0.02
137     +0.10      0.02
Negative Logits
Mlle        -1.42
Abbé        -1.39
Juf         -1.37
Souha       -1.35
reluct      -1.35
indestru    -1.32
Áng         -1.29
unspeak     -1.29
inconce     -1.28
shenan      -1.28
Positive Logits
EndInit             0.79
NSCoder             0.68
GraphicsUnit        0.67
BeginInit           0.59
CodedInputStream    0.59
商品説明            0.58
Espèce              0.56
consulté            0.56
isContained         0.56
ComVisible          0.55
Activations Density: 0.072%