INDEX
Explanations
spoilers in the text
references to spoilers in content
Negative Logits
llan          -0.73
tis           -0.71
undai         -0.70
CV            -0.66
artney        -0.66
issance       -0.66
mu            -0.65
trak          -0.64
rys           -0.63
independent   -0.63
Positive Logits
spoilers      1.12
spoiler       1.08
OIL           1.02
spoil         0.99
Ahead         0.93
SECTION       0.86
WARNING       0.85
ahead         0.84
Warning       0.83
Spoiler       0.82
Activations Density 0.057%