INDEX
Explanations
proper nouns, specifically names, and locations
references to specific individuals named Pierre
Negative Logits
agy       -0.83
iband     -0.76
ighting   -0.76
ritten    -0.75
orld      -0.75
ichael    -0.74
astered   -0.74
awar      -0.74
izen      -0.73
ograp     -0.73
Positive Logits
Trudeau   0.92
rall      0.80
Ô         0.79
ued       0.75
Ò         0.73
loo       0.72
lez       0.70
Santana   0.69
xual      0.69
Laurent   0.68
Activations Density 0.095%
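
The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample counts are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires. The dashboard itself does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 19 active tokens out of 20,000 sampled tokens.
sample = [0.0] * 19_981 + [1.5] * 19
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.095% would mean the feature fires on roughly one token in a thousand, which fits a feature tied to a narrow set of names.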