INDEX
Explanations
negations and conditional phrases
Follows "to" or a single-letter token
verbs of being or occurrence
Negative Logits
Majefty      -0.81
Portail      -0.74
myſelf       -0.73
chofe        -0.72
ſelf         -0.70
leaſt        -0.70
ſind         -0.69
pleaſure     -0.69
uſe          -0.69
onAnimation  -0.69
Positive Logits
consist      0.81
occur        0.79
entail       0.68
serve        0.67
happen       0.65
contain      0.64
ocur         0.64
constitute   0.64
arise        0.63
comprise     0.62
Activations Density 0.865%