Explanations
adverbs of degree and confirmation
NEGATIVE LOGITS
1.07
1.01
0.97
0.97
0.95
0.94
0.92
0.88
0.87
inoltre  0.86
POSITIVE LOGITS
really  1.17
definitely  1.16
kinda  1.10
deliciously  1.06
actually  1.06
delightfully  1.04
tatsächlich  0.98
yeah  0.98
действительно  0.96
pretty  0.96
Activations Density 2.464%