Explanations
mentions of national-level entities or events
references to national issues or topics
Negative Logits
Token      Logit
omething   -0.92
xual       -0.92
enhagen    -0.80
eva        -0.79
YE         -0.77
hops       -0.74
MODE       -0.74
veyard     -0.71
tto        -0.71
herty      -0.71
Positive Logits
Token        Logit
ized         0.98
wide         0.93
anthem       0.88
izations     0.88
izing        0.87
ization      0.86
ities        0.85
Geographic   0.85
ised         0.85
ITY          0.83
Activations Density: 0.037%