INDEX
Explanations
mentions of specific names, particularly the name "Boris"
references to Boris Johnson and his political context
Negative Logits
imil        -0.76
lessly      -0.72
uate        -0.68
ening       -0.68
imony       -0.67
tenance     -0.66
inez        -0.64
lessness    -0.63
oning       -0.62
ily         -0.61
Positive Logits
cair        0.81
enty        0.80
ensibly     0.78
adier       0.76
bledon      0.75
rer         0.74
ovich       0.73
cot         0.72
Soup        0.72
arettes     0.72
Activations Density: 0.125%