INDEX
Explanations
time-related contexts or actions
the term "ago" used in temporal contexts
Negative Logits
suspic         -0.86
mathemat       -0.86
plaus          -0.80
tremend        -0.79
glim           -0.76
belie          -0.75
mobility       -0.75
lining         -0.74
challeng       -0.73
illumination   -0.71
Positive Logits
vernment       1.42
zzi            1.08
xon            1.01
onga           0.97
zzo            0.94
etta           0.94
ago            0.90
edia           0.85
asca           0.84
allo           0.84
Activations Density 0.014%