INDEX
Explanations
political and governmental terms related to power and control
Neuron Alignment
Index    Value    % of L₁
1385     +0.10    0.3%
752      +0.10    0.3%
198      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1518     +0.10       0.04
198      +0.10       0.05
16       +0.09       0.06
Negative Logits
shenan          -0.71
upvoted         -0.63
disagre         -0.61
naï             -0.61
fucker          -0.59
<bos>           -0.59
cuck            -0.58
blushed         -0.57
ineffec         -0.57
motherfucker    -0.56
Positive Logits
autunno          0.73
virtù            0.70
regardant        0.66
Amérique         0.66
abbra            0.66
appuy            0.65
écout            0.65
onore            0.62
Ngb              0.62
considération    0.62
Activations Density: 0.472%