INDEX
Explanations
phrases indicating user actions or requests
after punctuation marks
apostrophes and periods in lists
Negative Logits
autorytatywna     -0.84
MessageTagHelper  -0.82
myſelf            -0.77
surate            -0.76
rungsseite        -0.75
doubtnut          -0.75
WaitGroup         -0.72
Theſe             -0.68
raiſ              -0.67
poffible          -0.67
Positive Logits
box               0.40
We                0.39
licht             0.38
box               0.37
The               0.36
(                 0.36
GeneratedCode     0.35
чита              0.35
DeleteBehavior    0.35
T                 0.35
Activations Density 0.034%