INDEX
Explanations
instances of the word "to" in various contexts
Negative Logits
Token        Logit
sauvages     -0.96
Monfieur     -0.94
énergé       -0.90
pleaſure     -0.90
scolaires    -0.89
industriels  -0.86
Theſe        -0.86
againſt      -0.85
Aplica       -0.85
Cuen         -0.85
Positive Logits
Token        Logit
be           0.99
have         0.86
actually     0.82
can          0.76
become       0.76
also         0.73
eventually   0.72
“            0.72
appear       0.71
could        0.70
Activations Density 0.132%
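
As a rough illustration of the density figure above, here is a minimal sketch. It assumes "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is counted per token over a sample of text;
    the dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations for 1,000 tokens: the feature fires on two of them.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.132% would mean the feature fires on roughly 1 token in 760.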