INDEX
Explanations
instances with the word "likely" and associated verbs that indicate probability or speculation
conditional phrases and negations regarding likelihood or possibility
Negative Logits
purportedly    -0.84
supposedly     -0.81
ĸļ             -0.79
allegedly      -0.76
apparently     -0.75
evidently      -0.72
andise         -0.66
igers          -0.66
clearly        -0.65
reportedly     -0.63
Positive Logits
misunder        0.73
rist            0.72
depending       0.71
rosso           0.69
underestimated  0.68
attering        0.67
unnoticed       0.66
heading         0.66
ulp             0.66
hin             0.66
Activations Density 0.335%
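
The density figure above can be read as the share of token positions on which the feature fires. The page does not state how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active; the source page does not define it.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1,000 token positions, with the feature active on 3.
sample = [0.0] * 1000
sample[10] = 1.2
sample[500] = 0.4
sample[999] = 2.7

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.335% would correspond to the feature firing on roughly 3 or 4 of every 1,000 tokens.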