INDEX
Explanations
statements related to policies, regulations, actions, or states of being
phrases related to regulatory or governmental discussions
Negative Logits
.–       -0.62
.</      -0.62
..."     -0.61
bury     -0.60
-"       -0.59
fil      -0.57
.","     -0.57
croft    -0.56
../      -0.55
Mine     -0.55
Positive Logits
meanwhile      1.03
also           1.01
furthermore    0.99
moreover       0.98
therefore      0.93
however        0.88
certainly      0.88
undoubtedly    0.84
thus           0.78
doubtless      0.76
Activations Density 0.760%