INDEX
Explanations
phrases indicating the necessity or requirement to do something, such as "need to," "have to," or "had to."
Neuron Alignment
Index   Value   % of L₁
1042    +0.10   0.3%
683     +0.10   0.3%
674     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
683     +0.10      0.07
1328    +0.10      0.06
411     +0.10      0.05
Negative Logits
intersper   -1.44
reluct      -1.44
emphat      -1.43
thut        -1.41
depic       -1.40
shenan      -1.40
maneu       -1.40
suspic      -1.39
vagu        -1.39
dises       -1.38
Positive Logits
be          0.98
been        0.86
occur       0.77
<bos>       0.76
být         0.76
be          0.75
become      0.74
come        0.69
worden      0.68
被          0.67
Activations Density 0.480%