INDEX
Explanations
phrases related to greetings and positive affirmations
the word "Good."
Negative Logits
Token        Logit
ptin         -0.79
eters        -0.78
hyde         -0.70
opers        -0.68
Tsukuyomi    -0.67
oths         -0.66
EStream      -0.66
ĸļ           -0.66
pora         -0.66
hod          -0.65
Positive Logits
Token        Logit
enough       1.27
reads        1.15
bye          1.15
Samar        1.07
luck         1.06
luck         1.03
intentions   0.99
sword        0.91
bye          0.91
deed         0.90
Activations Density: 0.074%
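
The page does not define the density figure. As a minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero, it could be computed as follows. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 7 of which activate the feature.
sample = [0.0] * 9993 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.074% would mean the feature fires on roughly 7 out of every 10,000 sampled tokens.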