INDEX
Explanations
repetitive uses of the verb "are"
Negative Logits
Token        Logit
matter       -0.91
oire         -0.80
terness      -0.77
dom          -0.73
ioch         -0.71
iliation     -0.69
arity        -0.68
ione         -0.68
cture        -0.67
ederation    -0.66
Positive Logits
Token        Logit
wolves       1.07
nods         0.85
wolf         0.80
types        0.79
vows         0.79
excerpts     0.78
weights      0.77
reminders    0.77
controls     0.77
eches        0.76
Activations Density 0.059%