INDEX
Explanations
phrases related to news reporting or quotes by individuals
sentiments and opinions related to experiences or evaluations
Negative Logits
Whilst          -0.80
colourful       -0.74
summar          -0.73
unsurprisingly  -0.73
favourite       -0.73
Cardiff         -0.72
analysed        -0.72
BBC             -0.69
organise        -0.68
criticised      -0.68
Positive Logits
..."   1.21
.''    1.09
.""    1.09
[/     1.05
----   1.04
-|     1.00
.�     0.97
,''    0.95
[/     0.95
"""    0.94
Activations Density 1.808%