INDEX
Explanations
descriptions of scenes in an urban setting
sentences that end with a period
Negative Logits
uly            -0.82
thal           -0.75
previously     -0.73
defe           -0.70
iber           -0.67
dimensional    -0.67
hostage        -0.65
tyr            -0.65
exclusively    -0.64
initially      -0.64
Positive Logits
Occasionally    1.13
Visitors        1.10
Amid            1.06
Residents       1.05
Each            1.04
Others          1.04
Meanwhile       1.04
Worse           1.03
But             1.03
toggle          1.03
Activations Density: 0.793%