INDEX
Explanations
phrases related to legal issues and criminal activities
Neuron Alignment
Index    Value    % of L₁
1343     +0.18    0.6%
184      +0.17    0.6%
674      +0.16    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
184      +0.18       0.03
324      +0.17       0.03
651      +0.16       0.02
Negative Logits
mef       -1.21
ivi       -1.18
glan      -1.17
franz     -1.15
daf       -1.13
dises     -1.11
fluo      -1.10
„,        -1.10
ghe       -1.09
dispen    -1.08
Positive Logits
itself        0.64
сьогодні      0.56
собенности    0.56
собенно       0.55
engagent      0.53
nogen         0.53
lcccccc       0.53
lccccc        0.50
alone         0.49
themselves    0.49
Activations Density 0.283%