INDEX
Explanations
phrases indicating controversy or public reaction to events
Neuron Alignment
Index   Value   % of L₁
50      +0.34   1.2%
198     +0.08   0.3%
1499    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
669     +0.34      0.04
1531    +0.08      0.04
1136    +0.08      0.04
Negative Logits
<bos>       -2.25
ⓧ           -0.66
underval    -0.47
enrich      -0.47
Assays      -0.46
Decorate    -0.46
achieve     -0.46
endwhile    -0.45
<?          -0.44
hisz        -0.44
Positive Logits
Muhamma     1.13
Minang      1.02
Jambi       0.96
Khart       0.95
Palembang   0.93
Keny        0.93
smtplib     0.92
Juf         0.92
Banjar      0.92
Nusantara   0.91
Activations Density: 0.499%