INDEX
Explanations
leaders, politicians, and government-related terms
references to notable companies, figures, and political events
Negative Logits
Els          -0.79
é¾įå         -0.72
$.           -0.72
Translation  -0.71
Ò            -0.70
}.           -0.70
Sov          -0.69
ãĥ¼ãĥĨ       -0.68
ãģĻ          -0.67
ãĤ¯          -0.64
Positive Logits
reacted      0.92
announced    0.89
awoke        0.86
unveiled     0.86
welcomes     0.85
reacts       0.82
has          0.81
celebrates   0.81
apologized   0.81
responded    0.79
Activations Density 0.613%