INDEX
Explanations
phrases related to investigating or discussing background information or details
Neuron Alignment
Index   Value   % of L₁
370     +0.13   0.5%
1870    +0.12   0.4%
1334    +0.12   0.4%
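
The "% of L₁" column appears to express each neuron's weight as a share of the L₁ norm of the full weight vector. That reading is an assumption, though it is consistent with the figures listed (a value of 0.13 at 0.5% implies an L₁ norm of roughly 26). A minimal Python sketch under that assumption follows; the function names and the toy vector are hypothetical and are not taken from any dashboard code.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means the absolute weight divided by the sum of
    absolute weights over the whole vector.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


def top_aligned(weights, k=3):
    """Return (index, value, l1_share) for the k largest-magnitude entries."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], l1_share(weights, i)) for i in order[:k]]


# Hypothetical toy vector, not data from this page.
toy_weights = [0.5, -0.25, 0.25]
for idx, value, share in top_aligned(toy_weights, k=2):
    print(f"{idx}\t{value:+.2f}\t{share:.1%}")
```

Small shares like the ones in this table would indicate that the direction is spread across many neurons rather than concentrated in a few.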
Correlated Neurons
Index   Pearson Corr.   Cosine Sim.
370     +0.13           0.02
1306    +0.12           0.02
1909    +0.12           0.02
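
The two columns in the Correlated Neurons table measure different things: Pearson correlation compares mean-centered series, while cosine similarity compares raw vectors by angle. Which quantities the dashboard feeds into each (activations over tokens, weight vectors, or something else) is not stated here, so the sketch below only illustrates the two formulas on hypothetical inputs.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation: covariance divided by the product of std devs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


def cosine_sim(xs, ys):
    """Cosine similarity: dot product divided by the product of L2 norms."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have equal length")
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Hypothetical inputs: ys is xs shifted by a constant.
xs = [1.0, 2.0, 3.0]
ys = [11.0, 12.0, 13.0]
print(pearson_corr(xs, ys), cosine_sim(xs, ys))
```

Because Pearson correlation subtracts the mean first, a constant shift leaves it unchanged, whereas cosine similarity does change. This is one reason the two columns can disagree for the same neuron.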
Negative Logits
Token      Logit
indestru   -0.61
accla      -0.57
Vedi       -0.54
inev       -0.54
Oltre      -0.52
incarcer   -0.52
depic      -0.51
encomp     -0.50
abstrait   -0.50
Abe        -0.49
Positive Logits
Token        Logit
background   1.43
Background   1.31
background   1.26
backgrounds  1.25
Background   1.18
BACKGROUND   1.11
BACKGROUND   1.04
背景         0.94
bg           0.92
bg           0.85
Activations Density: 0.062%
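
Activation density is commonly reported as the fraction of sampled tokens on which a feature fires. Whether this page uses a zero threshold or some other cutoff is not stated, so the threshold in the sketch below is an assumption, and the function name and sample values are hypothetical. At 0.062%, the feature would be active on roughly 6 of every 10,000 tokens.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above `threshold`.

    Assumption: a token counts as "active" when its activation exceeds zero.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations, not data from this page.
sample = [0.0, 0.0, 1.5, 0.0]
print(f"{activation_density(sample):.3%}")
```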