INDEX
Explanations
sequences of words that describe steps, instructions, or rules
instances of the phrase "as follows."
Negative Logits
olla       -0.80
oll        -0.78
orc        -0.77
ricular    -0.70
owered     -0.69
oard       -0.69
earthqu    -0.66
vere       -0.66
olin       -0.65
anyon      -0.64
Positive Logits
follows    1.16
bourg      0.86
follow     0.85
followed   0.82
follow     0.81
Follow     0.79
ĸļ         0.78
pursu      0.72
susp       0.71
¿½         0.66
Activations Density 0.006%