INDEX
Explanations
references to things being similar or alike
references to similar actions or statements across different contexts
Negative Logits
accordingly     -0.63
maximum         -0.58
otta            -0.58
whatever        -0.57
whichever       -0.57
bart            -0.56
PLEASE          -0.56
#$              -0.55
CHAPTER         -0.54
preferably      -0.52
Positive Logits
elsewhere       1.12
earlier         1.11
similar         0.98
similarly       0.97
previous        0.92
unsuccessfully  0.91
previously      0.82
unsuccessful    0.77
Earlier         0.73
other           0.73
Activations Density: 0.515%