Explanations
mentions of the act of writing
references to writing
Negative Logits
EGA        -0.88
Ĭ±         -0.85
rolet      -0.83
ega        -0.79
rals       -0.73
abe        -0.73
azar       -0.71
agara      -0.71
illon      -0.67
Afee       -0.64
Positive Logits
smanship    0.86
poems       0.82
writing     0.80
penned      0.78
notebook    0.77
writing     0.76
letters     0.73
poetry      0.73
essays      0.71
poem        0.71
Activations Density 0.026%