INDEX
Explanations
mentions of locations or events related to security threats
Neuron Alignment
Index   Value   % of L₁
1026    +0.12   0.4%
699     +0.12   0.4%
971     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
699     +0.12      0.05
1026    +0.12      0.05
359     +0.10      0.04
Negative Logits
ecru         -0.94
venezuela    -0.92
swarovski    -0.92
orlando      -0.85
geforce      -0.84
createDate   -0.82
murano       -0.82
userType     -0.80
inappro      -0.80
Bartholo     -0.78
Positive Logits
according    0.96
according    0.86
<bos>        0.82
According    0.72
According    0.69
selon        0.62
IconModule   0.61
según        0.59
μφωνα        0.57
accordance   0.55
Activations Density: 0.039%