INDEX
Explanations
Recounts of dining scenarios, including restaurant visits and meal descriptions.
food appetite
Negative Logits
Token                 Logit
itſelf                -1.13
Efq                   -1.13
་་                    -1.11
myſelf                -1.09
виправивши            -1.05
RUnlock               -1.05
存于互联网档案馆      -1.02
doubtnut              -1.01
Shakspeare            -1.01
Monfieur              -0.99
Positive Logits
Token                 Logit
[token not preserved] 0.49
.                     0.45
D                     0.43
<eos>                 0.42
↵↵                    0.42
e                     0.42
is                    0.41
di                    0.41
y                     0.40
:                     0.40
Activations Density 13.169%