INDEX
Explanations
words that end in 'ing' and 'ed' or phrases related to government, politics, and societal issues
Neuron Alignment
Index   Value   % of L₁
50      +0.31   2.1%
1967    +0.23   1.5%
1363    +0.13   0.8%
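The "% of L₁" column is not defined on the page. One plausible reading, and it is only an assumption, is that it reports each neuron's absolute weight as a share of the L₁ norm of the feature's direction. The listed rows are roughly consistent with that reading (0.31 / 0.021 ≈ 14.8, 0.23 / 0.015 ≈ 15.3, 0.13 / 0.008 ≈ 16.3, so an L₁ norm near 15 after rounding). A minimal sketch under that assumption, using a hypothetical toy vector rather than the real feature weights:

```python
def l1_shares(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm.

    Assumption: this mirrors what the "% of L1" column reports; the page
    itself does not state the formula.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        # An all-zero vector has no meaningful shares.
        return [0.0 for _ in weights]
    return [abs(w) / l1 for w in weights]


# Hypothetical toy vector, not the real feature direction.
toy_weights = [0.5, -0.25, 0.25]
toy_shares = l1_shares(toy_weights)
```

Multiplying a share by 100 gives a percentage in the same form as the table.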
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.31      0.14
1356    +0.23      0.07
1622    +0.13      0.06
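The two columns here measure different things, which is why they can disagree in magnitude. Assuming "P. Corr." is the Pearson correlation between two activation series over the same tokens, and "Cos Sim." is the cosine similarity between two weight vectors (the page does not confirm either), both can be sketched in plain Python. The function names and inputs are illustrative only:

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so the two values coincide only for mean-centered inputs.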
Negative Logits
<bos>             -3.23
<?                -1.04
ⓧ                 -1.03
/**               -0.98
[no token text]   -0.94
/*!               -0.81
/*                -0.81
/***              -0.80
<?                -0.79
/*++              -0.71
Positive Logits
bandung    1.53
Minang     1.46
jaya       1.43
lele       1.40
Juf        1.35
surabaya   1.25
accla      1.25
jawa       1.25
seksi      1.23
Confu      1.22
Activations Density: 1.446%
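Activation density is commonly read as the fraction of sampled tokens on which the feature fires. A minimal sketch, assuming "fires" means an activation strictly above a threshold of zero (the page does not state its threshold or sample), with a hypothetical helper name:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: density counts tokens with activation > threshold;
    the dashboard's exact definition is not given on the page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: one active token out of four.
toy_density = activation_density([0.0, 0.0, 0.5, 0.0])
```

Under this reading, a density of 1.446% would mean the feature is active on roughly 1 in 69 sampled tokens.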