INDEX
Explanations
references to written or composed content like letters, blog posts, novels, articles, or songs
references to writing and types of written works
Negative Logits
Token          Logit
ALE            -0.76
areth          -0.65
Ability        -0.65
ONSORED        -0.64
MSN            -0.64
granted        -0.61
FACE           -0.61
vantage        -0.61
seekers        -0.60
Location       -0.60
Positive Logits
Token          Logit
screenplay     1.08
memoir         1.03
autobiography  0.96
itatively      0.94
poem           0.91
blog           0.90
scathing       0.90
diary          0.87
poems          0.85
letter         0.83
Activations Density: 0.166%