Explanations
words related to newness or recent events
Negative Logits
Token      Logit
eward      -0.16
destin     -0.15
venture    -0.15
lacak      -0.15
etin       -0.15
reach      -0.15
enef       -0.14
-notch     -0.14
ouz        -0.14
sooner     -0.13
Positive Logits
Token      Logit
mint       0.33
mint       0.24
wed        0.24
Mint       0.20
formed     0.20
arrived    0.19
created    0.19
christ     0.18
newly      0.18
-formed    0.18
Activations Density: 0.014%