INDEX
Explanations
phrases related to news articles and stories
Neuron Alignment
Index    Value    % of L₁
2011     +0.14    0.5%
11       +0.12    0.4%
1839     +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2011     +0.14       0.05
1516     +0.12       0.04
406      +0.12       0.03
Negative Logits
Token        Logit
intit        -0.69
amitié       -0.64
colonie      -0.63
pompe        -0.59
Ressource    -0.55
unsplash     -0.54
poème        -0.53
canne        -0.53
/**          -0.53
serre        -0.53
Positive Logits
Token      Logit
story      1.41
story      1.32
Story      1.29
Story      1.29
stories    1.26
stories    1.13
STORY      1.10
STORY      1.03
Stories    1.02
Stories    0.98
Activations Density: 0.096%