INDEX
Explanations
pronouns and phrases expressing personal feelings or experiences
Neuron Alignment
Index    Value    % of L₁
50       +0.28    1.1%
381      +0.11    0.4%
1262     +0.10    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
658      +0.28       0.08
1262     +0.11       0.07
805      +0.10       0.07
Negative Logits
<bos>               -2.35
/***                -0.67
webElementXpaths    -0.60
HasIndex            -0.60
ⓧ                   -0.57
<?                  -0.56
lateinit            -0.55
Aholisi             -0.55
},{                 -0.53
-------------</     -0.53
Positive Logits
Minang              1.18
bandung             1.15
jawa                1.09
Muhamma             1.00
haer                0.98
meis                0.94
aen                 0.94
lele                0.93
zoll                0.92
Banjar              0.92
Activations Density 0.500%