INDEX
Explanations
instances of the infinitive form of verbs, particularly "to" followed by verb phrases
Neuron Alignment
Index   Value   % of L₁
50      +0.52   2.3%
1334    +0.10   0.5%
1967    +0.10   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1415    +0.52      0.03
1334    +0.10      0.03
1425    +0.10      0.02
Negative Logits
intersper    -1.34
<bos>        -1.01
rouse        -0.88
overcrow     -0.87
endow        -0.76
depic        -0.72
banish       -0.70
gratify      -0.69
disambigu    -0.69
koz          -0.68
Positive Logits
pymongo         0.84
smtplib         0.78
heapq           0.73
pymysql         0.73
demas           0.71
ausp            0.68
بسم             0.67
barbacoa        0.67
SneakyThrows    0.66
==""){          0.65
Activations Density 0.083%