INDEX
Explanations
email and online communication-related phrases
Neuron Alignment
Index   Value   % of L₁
1150    +0.13   0.4%
2019    +0.10   0.3%
478     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
873     +0.13      0.04
1471    +0.10      0.06
418     +0.10      0.02
Negative Logits
Hermans     -0.56
Recept      -0.52
Vrij        -0.52
Gorb        -0.51
Bergmann    -0.51
Unger       -0.50
Takk        -0.50
anti        -0.50
Werth       -0.50
mvh         -0.49
Positive Logits
Settembre   1.19
Ottobre     1.18
Venise      1.10
parteci     1.07
siff        1.06
Luglio      1.05
Giugno      1.03
Aprile      0.97
trion       0.94
morire      0.91
Activations Density 0.202%