INDEX
Explanations
phrases where something is being verbally added
instances of the word "added" and its variations
Negative Logits
misc        -0.59
mathemat    -0.57
SPONSORED   -0.56
cies        -0.56
prototype   -0.53
isoft       -0.53
Cop         -0.51
metic       -0.50
asus        -0.50
Harbor      -0.50
Positive Logits
sarcast      0.91
omin         0.88
that         0.82
insult       0.81
:            0.79
ictions      0.78
itional      0.78
itionally    0.74
:"           0.73
afterwards   0.72
Activations Density 0.062%