INDEX
Explanations
mentions of sports signings and related discussions
Neuron Alignment
Index   Value   % of L₁
1253    +0.13   0.4%
108     +0.09   0.3%
392     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
392     +0.13      0.04
1501    +0.09      0.04
1820    +0.09      0.04
Negative Logits
diarias          -0.62
élaboration      -0.57
geving           -0.56
spira            -0.55
hals             -0.55
Gesellschaft     -0.54
interprétation   -0.53
apartament       -0.53
accesibles       -0.52
vold             -0.52
Positive Logits
Souha      0.82
hentai     0.82
hoody      0.78
depic      0.75
hairc      0.75
Ahh        0.73
Xoxo       0.73
apprehen   0.73
racon      0.72
broderie   0.72
Activations Density: 0.313%