INDEX
Explanations
phrases describing groups of people
conjunctions, particularly the word "and."
Negative Logits
Token       Logit
itor        -0.72
Begins      -0.72
lear        -0.69
noon        -0.68
Compared    -0.65
uthor       -0.64
Firstly     -0.64
ieth        -0.63
umar        -0.63
agos        -0.62
Positive Logits
Token       Logit
etc         1.13
even        0.97
assorted    0.97
other       0.92
whatever    0.90
ultimately  0.88
otherwise   0.86
downright   0.84
etc         0.82
finally     0.81
Activations Density 0.189%