INDEX
Explanations
news related to conflict and violence
Neuron Alignment
Index   Value   % of L₁
1499    +0.18   0.5%
946     +0.14   0.4%
184     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1499    +0.18      0.09
946     +0.14      0.08
1018    +0.12      0.05
Negative Logits
hairc        -1.46
tricot       -1.37
swarovski    -1.32
tupperware   -1.32
ecru         -1.25
hoody        -1.24
cushi        -1.23
boop         -1.20
scrat        -1.19
broderie     -1.16
POSITIVE LOGITS
Noter        0.61
Galería      0.61
Misión       0.60
yesterday    0.60
تضيفلها      0.57
Economía     0.56
Consejo      0.55
Sair         0.55
Litteratur   0.54
Biografía    0.54
Activations Density 0.572%