Explanations
phrases related to ordering, coordination, and instructions
punctuation, particularly sentence-ending periods
Negative Logits
Token       Logit
instinct    -0.82
ŃĶ          -0.79
emic        -0.76
tangled     -0.75
buried      -0.73
footing     -0.73
bashing     -0.73
tackling    -0.71
shedding    -0.71
ugly        -0.71
Positive Logits
Token          Logit
Additionally   1.49
Alternatively  1.45
Therefore      1.38
Please         1.37
However        1.36
Otherwise      1.34
Depending      1.33
Furthermore    1.26
Also           1.24
Likewise       1.23
Activations Density: 0.474%