INDEX
Explanations
time-related words or phrases
phrases indicating the passage of time
Negative Logits
Token          Logit
anium          -0.80
ãĥİ            -0.73
udeb           -0.72
acceptable     -0.71
conservancy    -0.71
ums            -0.70
arching        -0.70
cheat          -0.67
ĸļ             -0.66
particularly   -0.64
Positive Logits
Token          Logit
,              0.97
he             0.81
they           0.79
however        0.77
she            0.74
thereafter     0.72
another        0.72
we             0.69
,.             0.69
someone        0.69
Activations Density 0.226%