Explanations
mentions of uncertainty and self-identification
the presence of the word "I'm" in various contexts
Negative Logits
aneers       -0.71
Skydragon    -0.63
rouse        -0.61
Gap          -0.60
Ensure       -0.57
Bench        -0.57
Roh          -0.57
Pow          -0.57
Poc          -0.57
Labrador     -0.56
Positive Logits
agine        1.18
gonna        0.98
umbai        0.95
ortal        0.95
selves       0.94
mediately    0.89
ighty        0.89
ael          0.88
initely      0.86
glad         0.79
Activations Density: 0.031%