Explanations
comparative phrases indicating that something is not as good or as strong as something else
comparative phrases that express ease or difficulty in relation to something
Negative Logits
ACTIONS           -0.78
COUR              -0.68
olla              -0.66
DRAG              -0.65
ICA               -0.65
âĹ¼               -0.64
ardon             -0.63
NUM               -0.63
DragonMagazine    -0.62
Lic               -0.60
Positive Logits
anymore           0.89
far               0.89
nor               0.85
well              0.84
imilar            0.83
much              0.83
vernment          0.81
evidenced         0.80
yet               0.79
part              0.78
Activations Density 0.073%