Explanations
phrases related to decision-making and consequences
Text after periods and commas
pronouns and articles after punctuation
Negative Logits
itſelf      -0.95
myſelf      -0.93
Monfieur    -0.89
purpoſe     -0.89
ſeveral     -0.84
pleaſure    -0.83
himſelf     -0.81
Theſe       -0.78
uſe         -0.76
EDEFAULT    -0.76
Positive Logits
這位          0.88
这位          0.84
the         0.66
The         0.58
he          0.55
WebServlet  0.52
pemuda      0.49
pria        0.48
He          0.46
Ma          0.45
Activations Density: 0.164%