INDEX
Explanations
words related to entertainment and performances
Neuron Alignment
Index   Value   % of L₁
528     +0.10   0.3%
1926    +0.09   0.3%
423     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.10      0.04
283     +0.09      0.01
690     +0.08      0.05
Negative Logits
McLaugh     -1.32
intersper   -1.26
Wtf         -1.14
encomp      -1.12
disambigu   -1.03
depic       -0.92
quitted     -0.92
apprehen    -0.92
McInt       -0.92
disagre     -0.91
Positive Logits
tainment      2.00
tainer        1.27
TAINMENT      1.16
tainers       1.12
taining       1.06
tained        1.03
tain          1.00
Baillargeon   0.93
tains         0.90
tainable      0.82
Activations Density 0.372%