Explanations
phrases that express making things easier or better
Negative Logits
Token           Logit
myſelf          -0.94
houſe           -0.94
Efq             -0.92
whoſe           -0.92
pleaſure        -0.91
utafitiHapana   -0.88
ſhe             -0.86
Majefty         -0.86
againſt         -0.86
raiſ            -0.85
Positive Logits
Token           Logit
make            1.09
Make            1.00
makes           0.94
MAKE            0.92
make            0.92
Make            0.91
made            0.91
Makes           0.84
MAKES           0.81
making          0.81
Activations Density 0.153%
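
As a rough illustration of how a density figure like the one above could be obtained, the sketch below assumes that "activations density" means the fraction of sampled token positions on which the feature has a nonzero activation. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (# positions with activation > threshold) / (# positions).
    """
    activations = list(activations)
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2,000 token positions, 3 of which activate the feature.
sample = [0.0] * 1997 + [0.7, 1.3, 0.2]
print(f"{activation_density(sample):.3%}")  # prints 0.150%
```

Under this reading, a density of 0.153% would mean the feature fires on roughly 1.5 of every 1,000 sampled tokens.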