Explanations
the word "suppose."
instances of the phrase "I suppose" or similar expressions indicating uncertainty or assumption
Negative Logits
DOC       -0.79
wal       -0.73
assic     -0.70
atures    -0.68
ammy      -0.67
Tips      -0.65
haar      -0.63
ixt       -0.62
jab       -0.62
hor       -0.61
Positive Logits
suppose   0.71
elf       0.66
Maggie    0.64
phantom   0.63
imagine   0.62
assum     0.62
────────  0.62
Malfoy    0.61
ーティ    0.61
────      0.61
Activations Density 0.010%