INDEX
Explanations
studies or reports indicating trends or statistics
the phrase "that" indicating statements of factual findings or conclusions
Negative Logits
ãĤ¤ãĥĪ    -0.71
arted     -0.68
atro      -0.68
iciary    -0.67
oses      -0.65
andem     -0.63
ETA       -0.62
ĪĴ        -0.62
Tank      -0.62
ready     -0.61
Positive Logits
although          1.12
despite           1.00
whilst            0.86
there             0.82
"[                0.82
whereas           0.82
unsurprisingly    0.81
while             0.81
unlike            0.80
none              0.75
Activations Density: 0.182%
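
The dashboard does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard itself does not define the term.
    """
    if len(activations) == 0:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 182 of 100,000 tokens.
sample = [1.0] * 182 + [0.0] * (100_000 - 182)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.182% would mean the feature is sparse, firing on roughly 2 tokens in every 1,000.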