INDEX
Explanations
controversial topics and societal issues related to freedom and politics
Neuron Alignment
Index   Value   % of L₁
856     +0.24   0.8%
1343    +0.24   0.8%
674     +0.15   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.24      0.03
1842    +0.24      0.05
198     +0.15      0.05
Negative Logits
shenan        -1.08
reluct        -1.07
disagre       -1.04
sophistic     -1.01
practition    -0.98
contribut     -0.97
volunte       -0.95
attemp        -0.94
depic         -0.93
uninten       -0.93
Positive Logits
useRouter         0.73
umerable          0.72
CiNii             0.72
useAuth           0.71
referrerpolicy    0.71
appunt            0.70
BIBSYS            0.69
meravigli         0.68
ovunque           0.67
defaultstate      0.67
Activations Density: 0.381%