INDEX
Explanations
instances of uncertainty or possibility
instances of the word "possibly."
Negative Logits
eries    -0.83
elson    -0.78
lins     -0.77
raint    -0.76
board    -0.76
raged    -0.74
oren     -0.74
oven     -0.74
yer      -0.74
rend     -0.74
Positive Logits
someday         0.78
infer           0.78
mistaken        0.75
jeopard         0.74
misunder        0.74
contam          0.74
conclud         0.73
forgiven        0.71
interstitial    0.71
hint            0.71
Activations Density 0.018%