INDEX
Explanations
phrases that indicate an action or decision made by a spokesperson or authority figure
the word "that" in various contexts
Negative Logits
ãĤ© (ォ)
-0.80
ãĤ¼ãĤ¦ãĤ¹ (ゼウス)
-0.73
greg
-0.69
tein
-0.66
EMBER
-0.66
INFO
-0.66
Tank
-0.65
Ü
-0.65
ãĥĺ (ヘ)
-0.65
Thumbnail
-0.65
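Several tokens in the list above (for example ãĤ¼ãĤ¦ãĤ¹) are not corrupted text: they look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The sketch below assumes the tokenizer follows the GPT-2 style byte-to-character table, which this page does not state; the function names `byte_display_table` and `decode_token` are made up for illustration.

```python
def byte_display_table():
    """Build the byte -> display-character table used by GPT-2 style BPE.

    Assumption: the vocabulary shown on this page uses this same table.
    """
    # Bytes that are already printable are displayed as themselves.
    printable = (
        list(range(0x21, 0x7F))      # '!' .. '~'
        + list(range(0xA1, 0xAD))    # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))   # U+00AE .. U+00FF
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is displayed as code point 256, 257, ... in order.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> original byte value.
DISPLAY_TO_BYTE = {ch: b for b, ch in byte_display_table().items()}


def decode_token(display: str) -> str:
    """Turn a displayed vocabulary entry back into readable text.

    Incomplete UTF-8 sequences (a token holding only part of a character)
    come back as the replacement character U+FFFD. Characters outside the
    display alphabet raise KeyError.
    """
    raw = bytes(DISPLAY_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ©", "ãĤ¼ãĤ¦ãĤ¹", "ãĥĺ", "greg"]:
        print(tok, "->", decode_token(tok))
```

Under that assumption the three multi-character entries decode to katakana (ォ, ゼウス, ヘ), while a single-character entry such as Ü is one byte of a longer UTF-8 sequence and cannot be decoded on its own.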
Positive Logits
although
1.37
while
1.20
despite
1.18
"[
1.15
whilst
1.04
whereas
1.02
unlike
1.02
unless
1.00
"â̦
0.92
if
0.90
Activations Density 0.231%