INDEX
Explanations
phrases and words related to debunking myths and misconceptions
Negative Logits
Token          Logit
inger          -0.14
ird            -0.14
Dysfunction    -0.14
ãĥ¼ãĤº         -0.14
ìĿĮ            -0.14
ết             -0.14
cef            -0.13
foresee        -0.13
retrospect     -0.13
Wikispecies    -0.13
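Entries such as ãĥ¼ãĤº and ìĿĮ in the negative-logit list look like tokens from a byte-level BPE vocabulary, shown in their raw printable form rather than as decoded text. The sketch below is one way to turn such a token back into readable text. It assumes the model uses the GPT-2 style byte-to-unicode mapping, which the page does not state. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Decode a byte-level token string; return it unchanged if it is not byte-level."""
    if any(ch not in _CHAR_TO_BYTE for ch in token):
        # Assumption: a character outside the table means the token is already decoded.
        return token
    raw = bytes(_CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĤº", "ìĿĮ", "inger", "ết"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãĥ¼ãĤº decodes to the katakana fragment ーズ and ìĿĮ to the Hangul syllable 음, while plain ASCII tokens such as inger pass through unchanged. The token ết contains a character outside the byte table, so the sketch treats it as already decoded.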
Positive Logits
Token          Logit
myth           0.60
myths          0.59
Myth           0.55
mythology      0.41
legend         0.37
falsehood      0.37
legends        0.36
false          0.35
mythical       0.34
Legend         0.33
Activations Density 0.363%