INDEX
Explanations
keywords related to occupying or fulfilling a role or position
Neuron Alignment
Index   Value   % of L₁
878     +0.11   0.4%
1381    +0.11   0.4%
168     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1381    +0.11      0.03
878     +0.11      0.03
168     +0.10      0.03
Negative Logits
saper    -0.54
cammin   -0.54
riman    -0.51
uniqu    -0.51
raso     -0.50
marte    -0.50
introd   -0.49
Legge    -0.49
usuf     -0.48
sopr     -0.48
Positive Logits
fill      1.37
fills     1.29
filling   1.29
fill      1.26
Fill      1.21
filled    1.20
filling   1.18
Filling   1.15
filled    1.15
FILL      1.14
Activations Density 0.100%