INDEX
Explanations
phrases related to starting or beginning something
Negative Logits
lua        -0.70
hof        -0.66
jong       -0.64
come       -0.63
arily      -0.60
jah        -0.59
ificent    -0.58
velop      -0.58
mo         -0.57
Ga         -0.56
Positive Logits
residence  1.06
rights     0.89
positions  0.73
position   0.72
arms       0.70
sugg       0.68
stances    0.66
ibilities  0.65
IPM        0.64
stantial   0.64
Activations Density: 0.023%