Explanations
celebrities, movies, and music-related information and news
Neuron Alignment
Index    Value    % of L₁
50       +0.18    0.7%
924      +0.10    0.3%
130      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1511     +0.18       0.03
924      +0.10       0.07
1862     +0.09       0.05
Negative Logits
<bos>              -2.10
GIH                -0.73
złotych            -0.73
rungsseite         -0.67
████               -0.64
principalColumn    -0.62
abetes             -0.62
Varint             -0.62
Попис              -0.60
ⓧ                  -0.59
Positive Logits
reluct             1.73
Bartholo           1.72
pamph              1.69
affor              1.69
impra              1.68
philanth           1.65
unden              1.64
indestru           1.61
maneu              1.61
increa             1.61
Activations Density: 0.367%