INDEX
Explanations
mentions of "French" entities or contexts
references to France and French-related topics
Negative Logits
iary       -0.89
iated      -0.82
Wan        -0.82
ication    -0.82
iates      -0.79
iating     -0.76
ript       -0.75
iations    -0.74
itar       -0.73
ifts       -0.72
Positive Logits
Hollande    1.11
bourg       1.08
fries       1.06
ois         1.00
Alps        0.98
franc       0.98
satirical   0.95
Macron      0.90
Algeria     0.87
Hebdo       0.87
Activations Density: 0.044%