INDEX
Explanations
statements where someone is conveying information
the word "that" in various contexts
Negative Logits
ãĤ©          -0.78
greg         -0.71
EMBER        -0.71
ãĤ¼ãĤ¦ãĤ¹    -0.71
Ü            -0.70
åĤ           -0.69
scribe       -0.69
ãĥĺ          -0.68
Tank         -0.67
aceae        -0.65
Positive Logits
although     1.36
despite      1.15
"[           1.14
while        1.09
whilst       0.98
unlike       0.94
whereas      0.92
unless       0.89
"â̦          0.84
since        0.83
Activations Density 0.196%
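
Several of the logit tokens above (for example ãĤ¼ãĤ¦ãĤ¹) look like the printable stand-ins that byte-level BPE tokenizers use for raw bytes, not like ordinary text. The sketch below shows one way to turn such a string back into readable text. It assumes the model behind this page uses the GPT-2 byte-to-character table, which the page itself does not state; `decode_token` and `UNICODE_TO_BYTE` are hypothetical helper names introduced only for this illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table in the style of GPT-2 byte-level BPE.

    Assumption: the tokenizer behind this page uses this table.
    """
    # Bytes that are already printable are mapped to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a byte-level token string back to text (hypothetical helper).

    Raises KeyError if the string contains a character outside the table.
    Incomplete UTF-8 sequences decode to the replacement character U+FFFD.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


# The escapes below spell the token shown as ãĤ¼ãĤ¦ãĤ¹ in the list above.
print(decode_token("\u00e3\u0124\u00bc\u00e3\u0124\u00a6\u00e3\u0124\u00b9"))
```

Under that assumption, the token decodes to the katakana string ゼウス, while plain ASCII tokens such as `although` pass through unchanged. Short entries such as åĤ cover only part of a multi-byte character, so they decode to the replacement character and do not form a readable glyph on their own.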