Explanations
instances of the word "possibly" followed by other words
tentative language indicating possibility or uncertainty
Negative Logits
ctions      -0.92
unes        -0.90
arthed      -0.80
ĸļ          -0.77
imet        -0.77
itute       -0.75
igers       -0.74
acas        -0.74
igger       -0.74
mire        -0.73
Positive Logits
even        0.97
sooner      0.79
someday     0.73
others      0.71
unsus       0.70
overtake    0.69
worse       0.68
optionally  0.66
eliminate   0.65
possibly    0.64
Activations Density 0.087%