Explanations
proper nouns associated with news articles and reports, including names of individuals and organizations
Neuron Alignment
Index    Value    % of L₁
1150     +0.17    0.5%
1919     +0.14    0.4%
1097     +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
981      +0.17       0.09
1919     +0.14       0.07
1097     +0.11       0.07
Negative Logits
bordeaux    -0.90
fluo        -0.79
oleo        -0.79
cioc        -0.78
soggior     -0.75
ouret       -0.74
ecru        -0.73
uccio       -0.71
vanil       -0.70
Conc        -0.69
Positive Logits
explains      0.66
commented     0.65
explained     0.62
oversaw       0.62
disagreed     0.61
kasarigan     0.61
testified     0.60
says          0.59
remarked      0.59
said          0.59
Activations Density: 0.288%