INDEX
Explanations
references to the word "one" and its various forms
Follows the word "one"
Negative Logits
Token       Logit
Jefus       -0.89
ſch         -0.86
Majefty     -0.85
ſche        -0.85
―――――       -0.84
myſelf      -0.80
Roskov      -0.79
pleaſure    -0.79
occaf       -0.78
raiſ        -0.78
Positive Logits
Token       Logit
thing       0.84
hundred     0.80
person      0.74
of          0.68
sided       0.68
third       0.67
onta        0.66
particular  0.65
big         0.65
another     0.64
Activations Density: 0.182%