INDEX
Explanations
phrases related to task completion or accomplishment
occurrences of the word "and."
Negative Logits
Token    Logit
rb       -0.80
meet     -0.77
fell     -0.73
hower    -0.72
oused    -0.71
omic     -0.70
ís       -0.70
uga      -0.67
opa      -0.66
byter    -0.65
Positive Logits
Token        Logit
etc          1.20
assorted     1.17
etc          0.99
blah         0.83
finally      0.81
possibly     0.78
ect          0.75
downright    0.73
perhaps      0.72
occasional   0.71
Activations Density: 0.290%