Explanations
the conjunction "but" used to introduce contrasting statements
Negative Logits
learn    -0.59
(−       -0.59
miss     -0.58
き       -0.55
DEBUG    -0.54
hyper    -0.53
hof      -0.50
erk      -0.49
exit     -0.49
hom      -0.49
Positive Logits
interestingly    0.70
secondly         0.68
luckily          0.67
hey              0.65
unfortunately    0.62
alas             0.61
importantly      0.59
sadly            0.59
thankfully       0.59
fortunately      0.56
Activations Density 0.087%