Explanations
mentions of specific locations or actions related to official government statements and operations
Neuron Alignment
Index   Value   % of L₁
453     +0.14   0.4%
1150    +0.13   0.4%
1535    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1506    +0.14      0.04
314     +0.13      0.03
395     +0.12      0.04
Negative Logits
virtù       -1.08
quegli      -0.99
ecru        -0.97
swarovski   -0.94
autunno     -0.91
vece        -0.90
hairc       -0.89
affitto     -0.86
oleo        -0.84
isSuccess   -0.82
Positive Logits
Dalam      0.60
dalam      0.59
Odpo       0.57
Història   0.56
Conteúdo   0.55
Fö         0.55
Lorsqu     0.54
Mű         0.54
În         0.54
Atsauces   0.54
Activations Density: 0.138%