Explanations
discussions related to political tensions and threats of conflict
Neuron Alignment
Index   Value   % of L₁
1535    +0.21   0.6%
2034    +0.19   0.6%
1150    +0.15   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1535    +0.21      0.09
382     +0.19      0.07
1445    +0.15      0.08
Negative Logits
hairc        -1.26
tupperware   -1.19
embodi       -1.15
tricot       -1.11
cushi        -1.11
Darum        -1.09
ecru         -1.06
hoody        -1.05
scrat        -1.05
amigurumi    -1.04
Positive Logits
UseVisualStyle    0.69
induk             0.66
]})               0.65
His               0.64
AnchorTagHelper   0.63
beker             0.62
matematika        0.62
He                0.62
He                0.60
Kategor           0.60
Activations Density: 0.462%