Explanations
dates and references to specific individuals or organizations mentioned in a formal context
instances of punctuation and phrases that indicate a list or sequence
Negative Logits
ndra       -0.67
アル       -0.61
peria      -0.59
resa       -0.59
aukee      -0.59
igers      -0.58
Score      -0.58
poke       -0.57
Fresh      -0.57
reperto    -0.57
Positive Logits
incidentally      1.11
according         1.08
unsurprisingly    1.03
according         0.99
interestingly     0.98
ironically        0.98
moreover          0.96
unlike            0.93
alas              0.92
fortunately       0.91
Activations Density: 0.078%