INDEX
Explanations
vocabulary related to frames or framing
Neuron Alignment
Index   Value   % of L₁
50      +0.17   1.0%
1052    +0.14   0.8%
1777    +0.12   0.7%
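The "% of L₁" column can be read as each neuron's share of the feature vector's L₁ norm. The sketch below assumes that reading: the share is the absolute weight divided by the sum of absolute weights over all neurons. The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard. Under this assumption, the three rows above (+0.17 at 1.0%, +0.14 at 0.8%, +0.12 at 0.7%) are consistent with an L₁ norm of roughly 17.

```python
def l1_share(weights):
    """Return each weight's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means |w_i| / sum_j |w_j| * 100.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("an all-zero vector has no L1 norm to share")
    return [100.0 * abs(w) / l1 for w in weights]


# Toy vector with hypothetical values (not dashboard data).
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print(shares)
```

The shares always sum to 100%, so a top entry of only 1.0% suggests the weight is spread thinly across many neurons rather than concentrated in a few.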
Correlated Neurons
Index   P. Corr.   Cos Sim.
1052    +0.17      0.03
1777    +0.14      0.03
1604    +0.12      0.03
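The two columns above can disagree, as they do here, because Pearson correlation centers each series on its mean before comparing, while cosine similarity does not. A minimal sketch, assuming both columns compare two activation series recorded over the same tokens; the function names and toy series are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centered series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity(
        [a - mean_x for a in x],
        [b - mean_y for b in y],
    )


# Toy series (hypothetical): y is x shifted by a constant offset.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]
pearson = pearson_correlation(x, y)   # close to 1: the shift is removed by centering
cosine = cosine_similarity(x, y)      # below 1: the shift changes the direction
```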
Negative Logits
<bos>          -3.03
ⓧ              -0.77
/***           -0.70
inaugurate     -0.69
endow          -0.62
<?             -0.61
               -0.61
MarshalTo      -0.61
endeavored     -0.59
rehabilitate   -0.59
Positive Logits
Frame    1.28
frame    1.27
Frames   1.27
frames   1.25
saar     1.17
frame    1.16
Frame    1.15
thuy     1.14
kark     1.11
FRAME    1.11
Activations Density 0.069%
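A density of 0.069% corresponds to roughly 7 active tokens per 10,000. The sketch below assumes density means the percentage of tokens on which the feature's activation exceeds a threshold (zero by default); the function name `activation_density` and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of tokens whose activation is strictly above `threshold`.

    Assumption: "density" counts a token as active when activation > threshold.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy activations (hypothetical): the feature fires on one token out of four.
density = activation_density([0.0, 0.0, 0.0, 1.5])
print(density)
```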