INDEX
Explanations
references to smoke and smoking actions
Neuron Alignment
Index   Value   % of L₁
1376    +0.20   0.8%
528     +0.15   0.6%
1350    +0.15   0.6%
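
The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it means 100 * |value| / L₁ norm of the feature's weight vector (this definition is an assumption, and `implied_l1_norm` is a hypothetical helper name): under that reading, each row of the Neuron Alignment table implies an L₁ norm of about 25, up to rounding in the displayed figures.

```python
# Rows copied from the Neuron Alignment table: (neuron index, value, % of L1).
rows = [(1376, 0.20, 0.8), (528, 0.15, 0.6), (1350, 0.15, 0.6)]


def implied_l1_norm(value, percent):
    """L1 norm implied by one row, ASSUMING percent = 100 * |value| / L1.

    This definition is an assumption; the dashboard does not state it.
    """
    return abs(value) / (percent / 100.0)


# One implied norm per table row; they should agree if the assumption holds.
implied = [implied_l1_norm(value, percent) for _, value, percent in rows]

for (index, _, _), norm in zip(rows, implied):
    print(f"neuron {index}: implied L1 norm ~ {norm:.1f}")
```

That all three rows agree is consistent with the assumed definition but does not confirm it, since the displayed values are rounded to two decimals.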
Correlated Neurons
Index   P. Corr.   Cos Sim.
1376    +0.20      0.03
1416    +0.15      0.03
1306    +0.15      0.03
Negative Logits
Sklici           -0.73
الحره            -0.55
كومونز           -0.53
Literatuur       -0.53
Zunanje          -0.52
Glej             -0.50
Tril             -0.49
censiti          -0.48
Mediabestanden   -0.48
Curt             -0.47
Positive Logits
smoke     1.37
Smoke     1.21
smokes    1.19
smoking   1.17
smoked    1.13
smokers   1.12
smoke     1.12
Smoke     1.11
smoker    1.09
Smoking   1.07
Activations Density: 0.054%