INDEX
Explanations
adverbs and their forms, particularly those that convey manner or degree
Negative Logits
          -0.17
pu        -0.16
sg        -0.15
sto       -0.15
cu        -0.15
iful      -0.15
uai       -0.14
quil      -0.14
824       -0.14
tron      -0.14
Positive Logits
referrer  0.16
esch      0.15
tics      0.15
uger      0.15
Speaking  0.14
ãĤ·ãĤ¢    0.14
éric      0.14
spe       0.14
اÙĦاع     0.14
ktion     0.14
Activations Density 0.456%