Explanations
comparisons of things as opposites
academic or philosophical terms related to contradiction or opposing views
Negative Logits
corrid        -0.75
mathemat      -0.65
Nationwide    -0.65
stocking      -0.65
beetles       -0.64
cow           -0.64
killer        -0.63
gelatin       -0.62
congest       -0.61
sleeves       -0.61
Positive Logits
xual          1.41
hesis         1.02
OPLE          1.00
heses         0.96
ynthesis      0.90
hran          0.89
chnology      0.88
ract          0.86
ctr           0.83
olkien        0.83
Activations Density: 0.050%