INDEX
Explanations
phrases that express conditional expectations or hypotheses
Negative Logits
itſelf     -1.13
Efq        -1.13
Majefty    -1.12
ſelf       -1.05
ſelves     -1.03
houſe      -0.98
Theſe      -0.97
Reſ        -0.96
Houſe      -0.96
himſelf    -0.93
Positive Logits
would       1.17
Would       1.00
Would       0.97
would       0.94
WOULD       0.84
d           0.80
Id          0.67
gustaría    0.65
id          0.65
würde       0.65
Activations Density 0.091%
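
A density figure like this is commonly read as the share of tokens on which the feature fires at all. The sketch below illustrates that reading only; the function name, the threshold, and the sample data are hypothetical and do not come from the dashboard, which may define density differently.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.091% would mean the feature is active on roughly 9 of every 10,000 tokens.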