INDEX
Explanations
phrases or sentences ending with `"."`
terminal punctuation, particularly periods and other sentence-ending markers
Negative Logits
starved     -0.72
consec      -0.66
wardrobe    -0.65
fairy       -0.64
thrill      -0.63
lifes       -0.63
habit       -0.63
uncond      -0.62
ninja       -0.62
scenery     -0.62
Positive Logits
Adds            0.86
Additionally    0.84
Towards         0.83
Presumably      0.82
Needless        0.81
According       0.81
Furthermore     0.81
According       0.80
Moreover        0.80
Refer           0.79
Activations Density: 0.097%