INDEX
Explanations
information about a specific topic, such as white tea, fishing policy, and ALS diagnosis, through instructional or informative text
Neuron Alignment
Index   Value   % of L₁
1141    +0.07   0.2%
1085    +0.07   0.2%
1271    +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1354    +0.07      0.02
225     +0.07      0.02
1137    +0.07      0.02
Negative Logits
burst       -0.90
bursts      -0.69
burst       -0.62
onPause     -0.60
Burst       -0.58
EqualsAnd   -0.57
bursting    -0.56
expand      -0.56
Burst       -0.55
flich       -0.54
Positive Logits
overwhelming   1.92
whelming       1.41
maroc          1.35
stockholm      1.34
lele           1.33
meis           1.32
oner           1.31
ordina         1.30
fup            1.29
wien           1.27
Activations Density 0.147%
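
A density figure like the one above is commonly read as the fraction of sampled tokens on which the feature activates at all. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the toy data are assumptions for illustration and are not taken from the dashboard.

```python
import numpy as np


def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    (above-threshold) activation; the dashboard's exact definition is
    not stated in the source.
    """
    acts = np.asarray(activations, dtype=float)
    if acts.size == 0:
        raise ValueError("activations must not be empty")
    return float(np.count_nonzero(acts > threshold) / acts.size)


# Toy data (hypothetical): 2000 tokens, the feature fires on 3 of them.
toy = np.zeros(2000)
toy[[10, 500, 1999]] = [0.4, 1.2, 0.1]

density = activation_density(toy)
print(f"{density:.3%}")  # 3 / 2000 -> 0.150%
```

Under this reading, a density of 0.147% would correspond to the feature firing on roughly 1 or 2 of every 1000 sampled tokens.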