INDEX
Explanations
phrases related to political news and commentary
punctuation or formatting elements in the text
Negative Logits
enqu           -1.00
util           -0.92
colours        -0.87
humour         -0.87
stra           -0.82
dolphin        -0.82
compass        -0.81
xual           -0.80
commissions    -0.80
behaviours     -0.80
Positive Logits
Story           1.67
RELATED         1.58
PHOTOS          1.57
Advertisement   1.56
Asked           1.56
Instead         1.54
ADVERTISEMENT   1.54
But             1.52
Related         1.49
Among           1.47
Activations Density 0.437%