Explanations
long sentences containing the word "wrote"
instances of the word "wrote"
Negative Logits
Token        Logit
ILCS         -0.80
Unity        -0.78
Ĭ±           -0.77
agara        -0.75
phant        -0.72
angular      -0.71
illon        -0.68
nel          -0.68
ICES         -0.66
allows       -0.65
Positive Logits
Token        Logit
letters      0.91
extensively  0.83
poems        0.81
eloqu        0.81
aloud        0.80
scathing     0.79
Letters      0.79
blog         0.79
sarcast      0.77
furiously    0.76
Activations Density: 0.043%