INDEX
Explanations
phrases that begin with the word "don’t" or variations of it
Negative Logits
pus           -0.66
DRAGON        -0.64
EStreamFrame  -0.63
EStream       -0.62
ħĭ            -0.61
Species       -0.60
spoiled       -0.60
phal          -0.59
Featured      -0.59
milo          -0.59

Positive Logits
't            1.54
ations        0.92
ned           0.91
ately         0.91
atives        0.89
ÃŃ            0.87
nel           0.87
ning          0.87
ovan          0.87
itely         0.86
Activations Density 0.023%