INDEX
Explanations
names and terms related to politics and government
Negative Logits
Token       Logit
rg          -0.73
ichick      -0.69
eal         -0.64
ãĥĥ         -0.63
packing     -0.62
Nicaragua   -0.61
aul         -0.61
ften        -0.61
FontSize    -0.61
oko         -0.60
Positive Logits
Token       Logit
D           2.34
D           1.91
d           1.53
Ds          1.46
Dum         1.35
d           1.33
DG          1.26
DAC         1.26
DV          1.22
DS          1.22
Activations Density: 0.589%