INDEX
Explanations
contractions specifically with "I am" written as "I'm"
instances of self-identification or personal statements
Negative Logits
ainer        -0.70
mater        -0.69
separates    -0.64
hinder       -0.62
looms        -0.61
occurs       -0.61
perish       -0.61
membr        -0.60
entails      -0.60
Tactics      -0.59
Positive Logits
gonna        1.27
glad         0.95
thankful     0.93
fortunate    0.92
lucky        0.91
hoping       0.90
going        0.89
gotta        0.87
sorry        0.87
afraid       0.86
Activations Density 0.034%