INDEX
Explanations
punctuation and formatting markers in the text
portions of text that are formatted differently than the main body of the text
Negative Logits
idth         -0.50
illet        -0.49
ucer         -0.49
poke         -0.47
angel        -0.46
Goth         -0.44
Celebration  -0.43
clad         -0.43
Crusade      -0.41
Birthday     -0.41
Positive Logits
SPONSORED     0.67
diplom        0.61
CVE           0.56
Furthermore   0.54
Additionally  0.52
moreover      0.51
Moreover      0.49
fung          0.49
furthermore   0.49
additionally  0.49
Activations Density: 2.434%