INDEX
Explanations
words related to data security and privacy policies on websites
Neuron Alignment
Index   Value   % of L₁
50      +0.20   0.9%
2034    +0.16   0.8%
406     +0.09   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1013    +0.20      0.10
1183    +0.16      0.05
2034    +0.09      0.08
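The two columns above measure different things, and a small sketch may help when reading them. It assumes "P. Corr." is a Pearson correlation between a feature's and a neuron's activations over the same tokens, and "Cos Sim." is a cosine similarity between two direction vectors; the page itself does not state this. All names and numbers below are hypothetical toy values, not data from this dashboard.

```python
import math

def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

def cosine(x, y):
    """Cosine similarity: no mean-centering, so offsets affect the result."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den

# Hypothetical per-token activations (assumption: one value per token).
feature_acts = [0.0, 0.0, 1.2, 0.0, 0.8, 0.0]
neuron_acts = [0.1, -0.2, 0.9, 0.0, 0.4, -0.1]
print(f"Pearson: {pearson(feature_acts, neuron_acts):+.2f}")

# Hypothetical weight directions (assumption: same dimensionality).
feature_dir = [0.5, -0.1, 0.3]
neuron_dir = [0.4, 0.2, -0.1]
print(f"Cosine:  {cosine(feature_dir, neuron_dir):.2f}")
```

Because Pearson correlation centers each vector first, it ignores constant offsets, whereas cosine similarity does not. That is one reason the two columns can rank the same neurons differently.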
Negative Logits
<bos>          -3.41
/***           -0.68
/*!            -0.62
broaden        -0.62
restore        -0.61
deepen         -0.61
build          -0.60
bestow         -0.60
rehabilitate   -0.58
reclaim        -0.57
Positive Logits
stockholm   1.18
bandung     1.15
panama      1.12
ibiza       1.11
nuoc        1.10
toledo      1.09
jaya        1.08
Minang      1.07
hcm         1.07
tucson      1.07
Activations Density 0.958%
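As a rough illustration of what a density figure like the one above could mean, the sketch below assumes it is the fraction of sampled tokens on which the feature's activation is above zero. The page does not define the metric, so the function name, the threshold, and the toy activations are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)

# Toy example: the feature fires on 1 of 4 tokens.
toy_acts = [0.0, 0.0, 1.5, 0.0]
print(f"{activation_density(toy_acts):.3%}")
```

Under this reading, a density below 1% would indicate a sparse feature that is silent on the vast majority of tokens.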