Explanations
present tense forms of the verb 'to be.'
Negative Logits
Token      Logit
ESE        -0.73
Reduce     -0.73
membr      -0.70
mater      -0.67
TAIN       -0.67
DS         -0.63
andise     -0.63
acters     -0.59
occurs     -0.59
denotes    -0.59
Positive Logits
Token      Logit
gonna      1.52
gotta      1.15
going      1.04
hoping     1.03
glad       1.00
sorry      0.99
afraid     0.96
guessing   0.96
supposed   0.94
got        0.93
Activations Density: 0.052%