INDEX
Explanations
references to the passage of time, particularly the phrase "ago."
Negative Logits
ScreenState  -0.18
annie        -0.17
moz          -0.15
enant        -0.15
andan        -0.15
ripp         -0.15
erner        -0.14
uka          -0.14
iced         -0.14
aturity      -0.14
Positive Logits
Abrams   0.15
arges    0.15
CHA      0.14
áu       0.14
fts      0.14
ody      0.14
chwitz   0.14
/by      0.14
Ïĩη      0.14
ngth     0.14
Activations Density: 0.017%