Explanations
phrases that indicate storytelling or the narration of experiences
Negative Logits
avar       -0.15
byn        -0.14
527        -0.14
iem        -0.14
èĨľ        -0.14
è·Ŀ        -0.14
cplusplus  -0.14
ë¡ľëĤĺ     -0.14
ture       -0.14
.proto     -0.13
Positive Logits
story      0.49
Story      0.40
Story      0.38
story      0.37
stories    0.36
tale       0.34
STORY      0.34
_story     0.33
.story     0.31
Stories    0.30
Activations Density 0.050%