INDEX
Explanations
references to myths or misconceptions
references to myths and misconceptions
Negative Logits
Token          Logit
perature       -0.77
foreseen       -0.74
rared          -0.73
affer          -0.70
arnaev         -0.70
bern           -0.70
imentary       -0.69
emouth         -0.68
ktop           -0.68
Interstitial   -0.67
Positive Logits
Token          Logit
Myth           1.15
icist          1.08
Myth           0.98
myth           0.93
ril            0.92
olog           0.84
myths          0.82
lore           0.81
ic             0.77
Reincarn       0.75
Activations Density: 0.016%