INDEX
Explanations
adjectives describing things as clear and direct
terms related to simplicity and clarity
Negative Logits
akin              -0.73
reen              -0.66
ingle             -0.66
wives             -0.65
Seeds             -0.63
aughters          -0.63
aden              -0.62
psey              -0.61
olina             -0.60
whales            -0.60
Positive Logits
ly                1.18
straightforward   1.02
lly               0.98
LY                0.88
tons              0.81
itably            0.81
ity               0.81
é¾                0.80
\\\\\\\\          0.80
NESS              0.78
Activations Density 0.019%