Explanations
text related to romance and relationships
Neuron Alignment
Index   Value   % of L₁
479     +0.10   0.3%
130     +0.10   0.3%
900     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1105    +0.10      0.04
130     +0.10      0.03
1971    +0.09      0.02
Negative Logits
extendable    -0.60
NKC           -0.54
}\}           -0.53
]=-           -0.52
zeera         -0.51
TestMethod    -0.50
PhysRevD      -0.49
☰             -0.49
trattano      -0.49
Seeder        -0.49
Positive Logits
apprehen      1.24
impra         1.22
gaily         1.22
McLaugh       1.20
vainly        1.18
unspeak       1.17
disagre       1.14
encomp        1.12
lovel         1.11
Shakspeare    1.11
Activations Density 0.178%