INDEX
Explanations
words related to online security measures
Neuron Alignment
Index   Value   % of L₁
50      +0.08   0.3%
437     +0.06   0.2%
325     +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
976     +0.08      0.05
325     +0.06      0.05
284     +0.06      0.04
Negative Logits
<bos>          -1.30
reinstate      -0.89
abolish        -0.85
rehabilitate   -0.76
inaugurate     -0.76
/**            -0.74
plundered      -0.73
endow          -0.73
defray         -0.71
enshr          -0.71
Positive Logits
Secure     1.50
secure     1.43
Secure     1.42
secure     1.41
bandung    1.21
security   1.20
cæ         1.09
napoli     1.08
jaya       1.07
securely   1.04
Activations Density 0.124%