INDEX
Explanations
spoiler-related content within text
terms related to spoilers in narratives
Negative Logits
ients          -0.71
des            -0.70
ald            -0.67
meters         -0.66
Ã              -0.66
cent           -0.66
iking          -0.65
kn             -0.65
calls          -0.65
initiatives    -0.65
Positive Logits
spoiler        3.98
Spoiler        3.55
Spoiler        3.06
spoilers       2.64
oiler          2.58
OIL            2.22
spoil          1.53
disclaimer     1.42
teaser         1.41
Warning        1.39
Activations Density 0.045%
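
As a rough illustration of what a density figure like this can mean, the sketch below assumes that activation density is the fraction of sampled token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; they are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: the dashboard does not state how density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 9 of 20,000 sampled token positions.
sample = [0.0] * 19991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.045%
```

Under that assumption, a density of 0.045% corresponds to the feature firing on roughly 4 to 5 tokens out of every 10,000.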