INDEX
Explanations
proper nouns and phrases related to head-to-head comparisons
references to television shows and their elements
Negative Logits
Riv      -0.82
PLA      -0.80
FANT     -0.77
vette    -0.77
iev      -0.72
Prot     -0.69
Els      -0.68
Dex      -0.67
VG       -0.66
Et       -0.65

Positive Logits
Head     2.08
Head     2.04
Heads    2.04
head     1.97
head     1.93
HEAD     1.84
heads    1.84
heads    1.73
HEAD     1.72
tails    1.60
Activations Density: 0.125%
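
As a rough reading aid for the density figure, the sketch below converts a percentage into an "about one activation per N tokens" figure. It assumes that the density is the fraction of tokens on which the feature has a non-zero activation; the page itself does not define the term, and the function name `tokens_per_activation` is hypothetical.

```python
from fractions import Fraction


def tokens_per_activation(density_percent: str) -> Fraction:
    """Convert a density given in percent to the average number of tokens per activation.

    Assumption: density = (tokens with non-zero activation) / (all tokens).
    The percentage is passed as a string so Fraction parses it exactly,
    avoiding floating-point rounding.
    """
    density = Fraction(density_percent) / 100  # e.g. "0.125" -> 1/800
    if density <= 0:
        raise ValueError("density must be positive")
    return 1 / density


if __name__ == "__main__":
    n = tokens_per_activation("0.125")
    print(f"Roughly one activation every {n} tokens")
```

Under that assumption, a density of 0.125% corresponds to the feature firing on about one token in 800, which fits a feature with a narrow trigger such as the "head"/"heads"/"tails" tokens listed above.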