Explanations
conjunctions or words indicating causality between events
the word "so" and its variants, indicating a focus on conversational transitions or emphasis
Negative Logits
Token       Logit
Modes       -0.65
Newsletter  -0.60
Passage     -0.57
ansk        -0.56
®           -0.56
icipated    -0.55
degree      -0.55
weights     -0.54
sha         -0.53
Aust        -0.52
Positive Logits
Token       Logit
oooo        1.31
ooo         1.28
oner        1.17
yeah        1.11
bered       1.09
oooooooo    1.04
much        0.99
glad        0.98
damn        0.98
othe        0.97
Activations Density: 0.075%