INDEX
Explanations
the word 'is' and some following words such as articles, prepositions, and one- or two-letter words
existential affirmations or statements
Negative Logits
Monfieur    -0.95
Theſe       -0.90
itſelf      -0.85
Efq         -0.84
myſelf      -0.82
themſelves  -0.81
pleaſure    -0.79
Jefus       -0.79
Beſ         -0.78
purpoſe     -0.76
Positive Logits
difficult   0.79
not         0.78
a           0.76
important   0.75
possible    0.75
intended    0.69
easy        0.69
known       0.68
available   0.67
also        0.63
Activations Density 7.483%