Explanations
sentences and phrases referencing artistic works, such as movies or books, emphasizing their titles and emotional impact
Negative Logits
[token missing]  -0.67
<bos>            -0.65
.                -0.55
-                -0.53
↵                -0.50
↵↵               -0.49
:                -0.48
;                -0.48
<eos>            -0.47
of               -0.46
Positive Logits
ſelf             0.93
^(@)             0.92
".               0.91
вгений           0.90
otomatig         0.89
addCriterion     0.89
doubtnut         0.87
Majefty          0.86
ſelves           0.85
Meksiku          0.84
Activations Density 0.541%