Explanations
phrases related to rules, regulations, and processes
terms related to character development in narratives
Negative Logits
was         -0.67
remembers   -0.65
laughs      -0.65
Its         -0.64
loses       -0.63
sleeps      -0.63
prefers     -0.62
blames      -0.62
thinks      -0.62
Goes        -0.62
Positive Logits
themselves     1.19
respectively   1.13
collectively   1.04
individually   0.94
respective     0.92
varying        0.82
constitute     0.82
geries         0.80
selves         0.79
together       0.79
Activations Density 0.862%