Explanations
mentions of the Canadian political leader Justin Trudeau
Negative Logits
Token        Logit
ously        -0.78
sit          -0.67
inet         -0.63
à¥           -0.62
cider        -0.60
vernment     -0.59
hard         -0.58
lihood       -0.58
parole       -0.57
ãģį          -0.57
Positive Logits
Token        Logit
itory        0.84
fing         0.80
achable      0.77
uple         0.77
anism        0.77
ifact        0.74
pole         0.73
itions       0.73
onga         0.73
bral         0.73
Activations Density 2.745%
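
The density figure above is commonly read as the share of token positions on which the feature is active. The page does not say how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the feature fires.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    activations = list(activations)
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical activations for one feature over eight token positions.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 2.745% would mean the feature is active on roughly 27 of every 1,000 sampled token positions.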