INDEX
Explanations
political and governmental terms
Neuron Alignment
Index   Value   % of L₁
1385    +0.11   0.3%
74      +0.08   0.2%
855     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
74      +0.11      0.03
1948    +0.08      0.04
16      +0.07      0.05
Negative Logits
impra        -0.90
increa       -0.86
scrat        -0.83
shenan       -0.82
reluct       -0.81
unden        -0.80
encomp       -0.80
philanth     -0.78
disreg       -0.77
downvoted    -0.77
Positive Logits
<bos>            0.64
maig             0.60
glicher          0.57
FunctionFlags    0.56
ability          0.55
Obrigada         0.55
Xar              0.55
Obrigado         0.53
lectual          0.52
Autoritní        0.51
Activations Density: 0.411%