INDEX
Explanations
phrases related to policies, procedures, and governmental actions
Neuron Alignment
Index   Value   % of L₁
381     +0.11   0.4%
32      +0.10   0.3%
1328    +0.10   0.3%
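The "% of L₁" column can be read as each neuron's share of the feature vector's L₁ norm (the sum of absolute values). The following is a minimal sketch under that assumption; the function name `l1_shares` and the toy weights are hypothetical and are not taken from this dashboard's data.

```python
def l1_shares(weights):
    """Return each weight's share of the vector's L1 norm (sum of absolute values)."""
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1 for w in weights]


# Hypothetical toy weights, NOT the real feature vector behind the table above.
toy_weights = [0.5, -0.25, 0.25]
shares = l1_shares(toy_weights)

for w, s in zip(toy_weights, shares):
    # Print the signed value next to its share of the L1 norm as a percentage.
    print(f"{w:+.2f}  {s:.1%}")
```

Under this reading, a top share as small as 0.4% would suggest the feature is spread across many neurons instead of being aligned with a single one.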
Correlated Neurons
Index   P. Corr.   Cos Sim.
1590    +0.11      0.05
32      +0.10      0.05
950     +0.10      0.05
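Assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." cosine similarity, the two columns measure related but distinct quantities: Pearson correlation is cosine similarity computed after subtracting each vector's mean. The sketch below illustrates the difference on hypothetical toy vectors; which vectors the dashboard actually compares is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Hypothetical toy vectors, not data from this dashboard.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]

cos = cosine_similarity(a, b)
corr = pearson_correlation(a, b)
print(f"cosine similarity:   {cos:+.3f}")
print(f"pearson correlation: {corr:+.3f}")
```

The toy vectors point in broadly the same direction (positive cosine similarity) yet vary in opposite ways around their means (negative correlation), so the two columns need not agree.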
Negative Logits
jaya      -0.88
fta       -0.88
susun     -0.83
hina      -0.82
miu       -0.81
bayan     -0.81
ftre      -0.81
bandung   -0.77
kasa      -0.76
saha      -0.76
Positive Logits
no           0.69
كومونز       0.60
nothing      0.59
plenty       0.57
UINTN        0.54
/**          0.54
Theres       0.50
no           0.50
expandindo   0.50
ample        0.50
Activations Density 0.139%