INDEX
Explanations
words related to beginning or starting
references to beginnings or changes in trends or situations
Negative Logits
wishes        -0.73
supervised    -0.71
cares         -0.71
loves         -0.69
careful       -0.68
lucky         -0.68
remembers     -0.67
spends        -0.65
SAL           -0.64
Doing         -0.62
Positive Logits
dawn          1.28
snowball      1.20
dissip        1.09
perme         1.08
overshadow    1.07
become        1.05
evapor        1.04
dwind         1.04
coales        1.01
reverber      1.00
Activations Density: 0.190%