INDEX
Explanations
expressions indicating a strong level of certainty or emphasis
statements or phrases expressing certainty or confirmation
Negative Logits
Token          Logit
idas           -0.81
awaru          -0.81
entary         -0.79
ingly          -0.77
gencies        -0.71
ULAR           -0.69
agus           -0.69
ENCY           -0.66
ilaterally     -0.66
respectively   -0.65
Positive Logits
Token          Logit
qualifies      0.78
deserved       0.77
wasn           0.68
suited         0.67
weren          0.67
influenced     0.67
ought          0.67
enjoyed        0.66
wouldn         0.66
benefited      0.66
Activations Density: 0.036%