Explanations
specifications or details within a text
transitional phrases and punctuation that indicate a change in thought or clarification
Negative Logits
waged          -0.63
untarily       -0.60
toll           -0.58
neighb         -0.58
designated     -0.58
elve           -0.57
ogram          -0.56
angering       -0.56
ixty           -0.56
surrendered    -0.56
Positive Logits
Quote          1.08
Firstly        1.05
Anyway         0.96
Basically      0.95
yeah           0.93
Firstly        0.91
why            0.90
EDIT           0.88
Basically      0.87
yeah           0.85
Activations Density 0.726%