INDEX
Explanations
the preposition "on" and its contextual use
Neuron Alignment
Index   Value   % of L₁
321     +0.14   0.8%
73      +0.12   0.7%
444     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
108     +0.14      0.00
70      +0.12      0.00
182     +0.12      0.00
Negative Logits
Token              Logit
¿½                 -3.24
ĸ                  -3.06
³                  -3.06
ĭ                  -3.05
↵                  -3.02
↵                  -3.02
↵                  -3.02
↵                  -3.02
(no token shown)   -3.02
<|outofrange|>     -3.02
Positive Logits
Token         Logit
Jr            1.46
IVATE         1.40
plead         1.39
toward        1.39
"}](#         1.34
reluctantly   1.32
without       1.30
calling       1.29
Wiley         1.29
hormone       1.28
Activations Density: 0.000%