INDEX
Explanations
contractions of the verb "to be" in sentences
the contraction "we're"
Negative Logits
amer        -0.68
mater       -0.67
andise      -0.67
cycl        -0.67
odor        -0.63
pedia       -0.62
separates   -0.62
manifests   -0.60
shire       -0.60
icipated    -0.60
Positive Logits
gonna       1.43
gotta       0.91
going       0.90
supposed    0.86
etsk        0.81
hoping      0.80
wolves      0.80
lucky       0.73
worth       0.72
okay        0.71
Activations Density 0.098%
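
As a rough illustration of how a density figure like the one above might be computed, here is a minimal sketch. It assumes that "activation density" means the fraction of sampled token positions on which the feature has a nonzero activation; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 98 of 100,000 token positions.
sample = [1.0] * 98 + [0.0] * 99_902
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.098%
```

Under this reading, a density of 0.098% would mean the feature is active on roughly one token in a thousand.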