INDEX
Explanations
references to napkins
references to naps and specific sports teams, particularly in a critical context
Negative Logits
Token     Logit
les       -0.80
water     -0.75
ishly     -0.72
ly        -0.72
Bron      -0.71
Keys      -0.71
ness      -0.70
Compan    -0.65
velt      -0.64
lying     -0.63
Positive Logits
Token                 Logit
oleon                 0.85
nesday                0.79
AAF                   0.79
ĸļ                    0.77
edia                  0.77
mington               0.75
merce                 0.72
CTR                   0.71
plex                  0.71
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³    0.70
Activations Density 0.064%