Explanations
international organizations and diplomacy
Negative Logits
Ethan      0.66
بی         0.64
Isopropyl  0.64
娇         0.63
烧         0.62
工业       0.60
儿子       0.60
𝑏          0.60
优质       0.59
商业       0.59
Positive Logits
multilateral   1.19
treaties       1.07
diplomatic     1.06
treaty         1.02
international  1.00
diplomacy      1.00
international  0.99
diplomats      0.98
UNESCO         0.98
国際           0.97
Activations Density: 0.061%