INDEX
Explanations
instances where the text talks about things being divided or split into categories
repeated use of the word "divided."
Negative Logits
trak      -0.73
oken      -0.72
ORTS      -0.69
leeve     -0.66
jamin     -0.65
elin      -0.65
ãĥīãĥ©    -0.65
CHA       -0.64
rouse     -0.63
queue     -0.62
Positive Logits
hairs     0.83
iple      0.78
evenly    0.77
SPL       0.76
rescent   0.74
naire     0.73
between   0.72
reth      0.71
unequ     0.71
alleg     0.71
Activations Density 0.022%
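
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the feature's activation is above a threshold. This is an assumption about the metric, not something the page states. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only so that the result matches the 0.022% shown above.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density means "share of tokens where the feature fires";
    the dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 22 of them active.
sample = [0.0] * 99_978 + [1.0] * 22
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.022% means the feature fires on roughly 22 out of every 100,000 tokens, which fits a narrow feature tied to wording about dividing or splitting.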