Explanations
phrases that emphasize the concept of "state of the art"
Negative Logits
Token      Logit
"          -1.10
“          -0.96
'],'       -0.78
'          -0.70
is         -0.70
I          -0.69
(blank)    -0.68
']").      -0.67
+          -0.65
”          -0.64
Positive Logits
Token       Logit
Efq         1.20
Cæsar       1.12
itſelf      1.11
Shakspeare  1.06
Diſ         1.02
Monfieur    1.01
myſelf      1.01
houſe       1.00
Reſ         0.99
pleaſure    0.98
Activations Density: 0.096%