Explanations
phrases related to TV shows or programs
mentions of "show," indicating a focus on television or performance-related content
Negative Logits
neutral     -0.74
goals       -0.66
membranes   -0.65
liter       -0.65
leveled     -0.61
etooth      -0.61
arily       -0.60
ends        -0.60
dec         -0.60
airborne    -0.60
Positive Logits
show        4.11
shows       2.86
Show        2.18
Show        1.89
show        1.86
SHOW        1.78
Shows       1.57
display     1.35
episode     1.28
hide        1.26
Activations Density 0.008%