INDEX
Explanations
phrases indicating uncertainty or lack of clarity in a statement
Negative Logits
fulness     -0.65
rence       -0.65
HAHAHAHA    -0.59
"""         -0.58
cellence    -0.57
olon        -0.56
die         -0.56
lite        -0.56
commits     -0.55
multi       -0.55
Positive Logits
unclear       1.70
unknown       1.16
speculated    1.15
conceivable   1.12
believed      1.09
uncertain     1.07
doubtful      1.00
possible      0.95
estimated     0.94
unlikely      0.93
Activations Density: 0.134%