INDEX
Explanations
phrases related to criticism and negativity
Neuron Alignment
Index   Value   % of L₁
674     +0.18   0.5%
604     +0.10   0.3%
198     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
83      +0.18      0.03
835     +0.10      0.01
1455    +0.09      0.03
Negative Logits
accla      -1.11
milf       -1.10
apprehen   -1.10
encomp     -1.08
jurassic   -1.07
depic      -1.07
disagre    -1.06
increa     -1.04
reluct     -1.02
emphat     -1.01
Positive Logits
censiti            0.70
setVerticalGroup   0.63
ImageContext       0.62
GraphicsUnit       0.59
+#+                0.59
calculées          0.59
Obrázky            0.58
CppMethod          0.57
دیکھیے             0.56
Wikimédia          0.56
Activations Density 0.382%