INDEX
Explanations
mentions of smoke-related terms or phrases
Neuron Alignment
Index   Value   % of L₁
1376    +0.16   0.9%
50      +0.16   0.9%
1350    +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1376    +0.16      0.03
1350    +0.16      0.03
1306    +0.14      0.03
Negative Logits
<bos>       -3.23
const       -0.73
ⓧ           -0.70
ll          -0.69
/*          -0.68
int         -0.68
            -0.68
Enllaços    -0.67
Além        -0.67
enumerate   -0.67
Positive Logits
stockholm   1.82
maneu       1.76
hcm         1.76
accla       1.70
impra       1.67
shenan      1.66
Juf         1.65
carrefour   1.64
dises       1.62
bourg       1.61
Activations Density 0.053%