Explanations
phrases that follow the word "Because"
causal relationships or explanations in a text
Negative Logits
lem          -0.75
yan          -0.70
åĤ           -0.69
scr          -0.66
SPONSORED    -0.66
Gas          -0.66
ymph         -0.65
cloth        -0.63
shr          -0.63
agin         -0.63
Positive Logits
rely         1.00
urers        0.83
akening      0.79
awaru        0.77
ertodd       0.71
uesday       0.68
imaru        0.68
ufficient    0.67
pite         0.67
pread        0.65
Activations Density: 0.052%