INDEX
Explanations
sentences ending in '.'
phrases or references related to significant events or changes
Negative Logits
someday     -0.70
discourage  -0.65
pard        -0.64
hope        -0.63
welcomes    -0.63
morrow      -0.63
urge        -0.62
foreigners  -0.62
nicer       -0.60
wills       -0.60
Positive Logits
âĢ          1.40
âĸ          1.18
Upon        1.02
Within      1.00
ãĢ          1.00
̶           0.99
âĢ          0.95
âĹ          0.94
Initially   0.92
ccording    0.92
Activations Density 0.628%