INDEX
Explanations
instances of the verb "to be" in various forms
Negative Logits
Conſ          -0.73
myſelf        -0.72
pleaſure      -0.67
ſche          -0.67
ValueStyle    -0.66
itſelf        -0.65
ſever         -0.64
reaſon        -0.61
Perſ          -0.61
placement     -0.60
Positive Logits
es            1.47
sp            0.91
ddelweddau    0.84
Qu            0.77
е             0.76
'},           0.74
'],           0.71
'}>           0.71
'):           0.70
"],           0.70
Activations Density: 0.046%