Explanations
stories or narratives
mentions of "tales" or narratives
Negative Logits
ividual     -0.96
erate       -0.86
gettable    -0.79
ournal      -0.79
ussy        -0.76
lav         -0.75
edient      -0.75
itter       -0.74
inyl        -0.74
foreseen    -0.74
Positive Logits
tales       1.32
tale        1.27
Tale        1.01
Tales       1.00
tale        0.91
tell        0.84
Reincarn    0.84
Ragnarok    0.80
fare        0.80
imagin      0.72
Activations Density: 0.017%