INDEX
Explanations
phrases related to desserts and chocolate-based confections
Negative Logits
<unused8>     -0.77
<unused43>    -0.76
<pad>         -0.76
[@BOS@]       -0.76
<unused41>    -0.76
<unused28>    -0.76
<unused42>    -0.76
<unused74>    -0.76
<unused16>    -0.76
<unused23>    -0.76
Positive Logits
chocolate     1.17
Chocolate     1.06
Chocolate     1.05
chocolate     1.02
chocolates    0.81
OCOLATE       0.81
chocol        0.77
cocoa         0.77
шокола        0.77
cioccolato    0.76
Activations Density 0.266%