INDEX
Explanations
names followed by descriptive suffixes
Negative Logits
waivers    0.29
HIV        0.28
guar       0.28
Aztec      0.27
UHF        0.27
Congress   0.26
codec      0.26
AIDS       0.26
casein     0.26
Interpol   0.26
Positive Logits
ari        0.36
ado        0.36
aj         0.35
4          0.33
bl         0.32
jan        0.32
ita        0.32
ando       0.31
ank        0.31
3          0.31
Activations Density 0.006%