INDEX
Explanations
sentences indicating personal experiences and emotional growth
Neuron Alignment
Index    Value    % of L₁
481      +0.13    0.5%
1757     +0.13    0.5%
1837     +0.12    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1837     +0.13       0.05
481      +0.13       0.04
370      +0.12       0.04
Negative Logits
effe          -0.87
perfet        -0.77
convenable    -0.77
lidl          -0.76
alre          -0.74
acce          -0.74
secon         -0.74
purcha        -0.73
fuf           -0.73
Jä            -0.73
Positive Logits
down       1.07
down       1.03
DOWN       1.03
Down       1.01
DOWN       0.96
Down       0.92
downs      0.89
downs      0.83
ダウン     0.76
Downing    0.71
Activations Density 0.106%