INDEX
Explanations
instances of the phrase "one of" followed by a number
phrases that repeatedly reference "one of" followed by a list or category
Negative Logits
agre       -0.67
urated     -0.67
rina       -0.66
ende       -0.66
surpr      -0.63
culosis    -0.63
chart      -0.61
condem     -0.61
allery     -0.60
ĸļ         -0.60
Positive Logits
ses        0.78
them       0.77
theirs     0.76
icial      0.70
course     0.67
course     0.64
them       0.64
whom       0.62
THEM       0.62
us         0.62
Activations Density: 0.102%