Explanations
phrases with the word "am"
the phrase "I am" followed by various personal statements or characteristics
Negative Logits
externalToEVAOnly    -0.69
Gap                  -0.68
occurs               -0.65
lasts                -0.65
entails              -0.65
Colleges             -0.64
hatch                -0.63
Tactics              -0.63
achieves             -0.62
fails                -0.62
Positive Logits
glad                 1.14
thankful             1.12
grateful             1.04
azon                 0.98
fortunate            0.96
proud                0.94
honored              0.92
amazed               0.91
myself               0.90
delighted            0.90
Activations Density 0.096%
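
The density figure is usually read as the fraction of token positions on which the feature fires at all. The page itself does not define it, so the sketch below treats that reading as an assumption. The function name `activation_density`, the zero threshold, and the toy activation list are all hypothetical and chosen only to reproduce a value of the same size as the one reported.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active (activation strictly above the threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: the feature fires on 96 of 100,000 tokens.
acts = [0.0] * 99_904 + [1.5] * 96
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.096%
```

Under this reading, a density of 0.096% means the feature is active on roughly one token in a thousand, which fits a feature tied to a narrow pattern such as "I am" followed by a personal statement.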