INDEX
Explanations
words related to newness or novelty
references to new beginnings or new entities
Negative Logits
mination        -0.87
ashtra          -0.79
jar             -0.79
halla           -0.76
ostic           -0.74
xual            -0.74
Zip             -0.73
chnology        -0.72
rary            -0.72
Hayward         -0.71
Positive Logits
bies            0.91
bie             0.89
arrivals        0.87
teammates       0.85
lease           0.84
batch           0.83
surroundings    0.82
acquaintances   0.82
venture         0.82
predicament     0.81
Activations Density: 0.046%