INDEX
Explanations
phrases related to conflict and violence
Neuron Alignment
Index   Value   % of L₁
964     +0.12   0.3%
1013    +0.11   0.3%
1798    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
166     +0.12      0.05
1013    +0.11      0.07
946     +0.10      0.05
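
The P. Corr. and Cos Sim. columns report two different similarity measures. The page does not say which inputs each one is computed from, so the following is only a minimal sketch under an assumption: that Pearson correlation is taken over paired activation values and cosine similarity over two weight vectors. The function names and toy inputs are hypothetical and do not come from the page.

```python
import math

def pearson_corr(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed: paired activations)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and standard deviations, all left unnormalised (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

def cosine_sim(u, v):
    """Cosine similarity of two equal-length vectors (assumed: weight directions)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy inputs, purely illustrative.
acts_feature = [0.0, 1.0, 2.0, 3.0]
acts_neuron = [1.0, 3.0, 5.0, 7.0]   # exact linear function of acts_feature
print(pearson_corr(acts_feature, acts_neuron))
print(cosine_sim([1.0, 0.0], [0.0, 1.0]))
```

Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so the two columns can disagree for the same pair.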
Negative Logits
webElementXpaths   -0.55
idUser             -0.54
গ্রহ                -0.49
ligiloj            -0.48
paralel            -0.48
dió                -0.47
valla              -0.47
diagon             -0.47
paradigma          -0.47
HtmlAttribute      -0.46
Positive Logits
partying     0.83
overcrow     0.79
intersper    0.77
indestru     0.76
encomp       0.72
impra        0.72
nightlife    0.72
party        0.71
logitech     0.71
inappro      0.70
Activations Density: 0.761%