INDEX
Explanations
contractions that include "are" followed by an adjective
occurrences of the contraction "we're."
Negative Logits
Reduce    -0.72
mater     -0.70
ESE       -0.69
andise    -0.68
membr     -0.63
TAIN      -0.62
={        -0.61
iates     -0.60
DS        -0.60
acters    -0.59
Positive Logits
gonna     1.45
gotta     1.13
hoping    1.01
going     0.99
supposed  0.95
guessing  0.92
afraid    0.92
glad      0.91
got       0.90
sorry     0.88
Activations Density 0.068%