INDEX
Explanations
references to specific locations or individuals, particularly the name "Hague."
references to individuals and places related to political events
Negative Logits
idity      -0.91
orescent   -0.72
istors     -0.71
forms      -0.70
uder       -0.70
isle       -0.69
icides     -0.68
idation    -0.68
cooks      -0.67
udeau      -0.67
Positive Logits
vernment   0.83
Lans       0.81
ç·         0.73
emouth     0.71
cloth      0.70
RAFT       0.66
Herz       0.65
etz        0.65
enegger    0.65
HL         0.64
Activations Density: 0.023%