INDEX
Explanations
mentions of statistical data, trends, and analysis on societal or economic issues
Neuron Alignment
Index    Value    % of L₁
50       +0.26    1.0%
1253     +0.09    0.3%
1499     +0.07    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1253     +0.26       0.04
327      +0.09       0.06
1726     +0.07       0.04
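The two column headers above can be read with a small sketch. It assumes that "P. Corr." abbreviates Pearson correlation and "Cos Sim." abbreviates cosine similarity; the page does not state which vectors are being compared, so the function names and toy inputs below are hypothetical and only illustrate the two standard formulas.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed meaning of 'P. Corr.')."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, without the 1/n factor (it cancels in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed meaning of 'Cos Sim.')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Pearson correlation subtracts the means before comparing, while cosine similarity does not, so the two columns can differ considerably for the same pair.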
Negative Logits
<bos>            -2.80
ⓧ                -0.92
<?               -0.86
                 -0.81
/***             -0.78
/**              -0.74
Transkript       -0.69
<?               -0.65
Transcripción    -0.63
ždý              -0.61
Positive Logits
Muhamma    1.03
seksi      0.94
Bekasi     0.91
kafe       0.90
jati       0.90
jawa       0.89
taha       0.88
Minang     0.88
saha       0.87
Banjar     0.87
Activations Density 0.904%
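A minimal sketch of how a density figure like this could be computed, assuming it means the fraction of sampled tokens on which the feature's activation is above zero. The page does not define the term; the function name `activation_density` and the `threshold` parameter are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Toy data: the feature fires on 1 of 4 sampled tokens.
density = activation_density([0.0, 0.0, 0.5, 0.0])
print(f"{density:.3%}")
```

Under this reading, 0.904% would correspond to the feature firing on roughly 9 of every 1,000 sampled tokens.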