INDEX
Explanations
transitions indicating a contrast or shift in focus
instances of the word "However."
Negative Logits
Goth          -0.66
board         -0.63
cup           -0.62
stein         -0.61
uto           -0.59
award         -0.59
letter        -0.58
rieg          -0.57
parade        -0.57
speech        -0.56
Positive Logits
chery          0.78
CLASSIFIED     0.78
tons           0.74
lers           0.74
interestingly  0.73
theless        0.72
chers          0.72
importantly    0.71
BIL            0.69
THER           0.67
Activations Density: 0.035%