INDEX
Explanations
phrases indicating a sudden change or significant event happening
phrases indicating the concept of entirety or inclusion
Negative Logits
nes         -0.72
lees        -0.71
eller       -0.67
gee         -0.67
ritic       -0.65
ombat       -0.64
eday        -0.64
obyl        -0.62
archives    -0.60
ainer       -0.59
Positive Logits
sudden      1.01
us          0.96
eternity    0.83
them        0.82
these       0.81
humanity    0.80
those       0.75
humankind   0.73
ahu         0.71
Us          0.71
Activations Density: 0.065%