INDEX
Explanations
references to political figures and their actions regarding international agreements
Negative Logits
cplusplus   -0.16
elden       -0.15
Patch       -0.15
Patch       -0.15
cxx         -0.14
ttp         -0.14
vant        -0.14
orex        -0.14
laden       -0.14
íĺĢ         -0.13
Positive Logits
CD          0.40
Bund        0.33
CD          0.33
SPD         0.31
spd         0.29
cd          0.27
Greens      0.26
CDs         0.26
Chancellor  0.24
chancellor  0.24
Activations Density 0.043%