INDEX
Explanations
phrases that indicate future events or imminent releases
Negative Logits
abyrin      -0.19
abyrinth    -0.16
riz         -0.15
inho        -0.15
eses        -0.14
ушка        -0.14
/DD         -0.14
ongoing     -0.14
ized        -0.14
existing    -0.14
Positive Logits
/current       0.27
/up            0.18
/new           0.17
-generation    0.17
lassen         0.16
ling           0.15
generations    0.15
retirees       0.15
/original      0.15
zer            0.15
Activations Density 0.018%
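
To give a sense of scale for the density figure, here is a minimal sketch. The page does not define the term, so this assumes the common reading: the fraction of sampled token positions on which the feature's activation is nonzero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The page itself does not state the definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 18 of which activate the feature.
sample = [0.0] * 99_982 + [1.5] * 18
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.018% corresponds to roughly 18 active positions per 100,000 tokens, which marks this as a sparsely firing feature.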