INDEX
Explanations
phrases related to cyber threats and security measures
Neuron Alignment
Index   Value   % of L₁
1870    +0.14   0.5%
1677    +0.13   0.5%
1034    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1677    +0.14      0.02
1034    +0.13      0.02
1994    +0.11      0.02
Negative Logits
Judd        -0.52
ifflin      -0.51
şam         -0.49
retim       -0.49
çeği        -0.48
chłop       -0.47
DotNetBar   -0.46
tagPool     -0.46
acetic      -0.44
ffindor     -0.44
Positive Logits
cyber       1.31
cyber       1.30
Cyber       1.29
Cyber       1.27
exé         1.18
silikon     1.03
ciber       1.01
dovr        1.00
ujedno      0.95
brille      0.95
Activations Density 0.051%