INDEX
Explanations
references to political debates and their formats
Negative Logits
gue       -0.17
Tent      -0.15
gener     -0.14
ziel      -0.14
ãĥ³ãĥĶ    -0.14
awa       -0.14
ffen      -0.14
/auto     -0.13
wet       -0.13
Gro       -0.13
Positive Logits
akes      0.16
ehr       0.15
884       0.14
quete     0.14
harma     0.14
rens      0.14
hec       0.14
ayar      0.14
trl       0.13
064       0.13
Activations Density 0.020%