INDEX
Explanations
emotional and personal narrative content, particularly focusing on loss and deep relationships
Neuron Alignment
Index    Value    % of L₁
674      +0.15    0.5%
468      +0.14    0.4%
1937     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
972      +0.15       0.06
1937     +0.14       0.07
468      +0.10       0.06
Negative Logits
İl            -0.46
Ī             -0.45
Therefore     -0.45
الاطلاع       -0.45
Hence         -0.44
pushd         -0.43
Std           -0.43
However       -0.42
FontOfSize    -0.42
Hence         -0.41
Positive Logits
affez         0.82
dags          0.76
germain       0.76
tyme          0.75
reft          0.74
vivace        0.72
naer          0.72
marrone       0.70
eccellente    0.69
rispond       0.68
Activations Density: 0.419%