INDEX
Explanations
phrases related to intentions or purposes
phrases indicating intentions behind actions or objects
Negative Logits
uesday              -0.61
Solitaire           -0.59
Flavoring           -0.58
anon                -0.57
natureconservancy   -0.57
laws                -0.56
reports             -0.53
urus                -0.52
knots               -0.52
Videos              -0.51
Positive Logits
to                  0.97
primarily           0.86
solely              0.85
principally         0.81
Parenthood          0.76
purely              0.76
specifically        0.75
to                  0.73
for                 0.72
chiefly             0.71
Activations Density: 0.051%