INDEX
Explanations
specific time indicators or dates in a sentence
the word "of" in various contexts
Negative Logits
ertodd    -0.78
istries   -0.74
corrid    -0.69
asca      -0.67
aukee     -0.65
estyles   -0.65
ancest    -0.64
istg      -0.63
illac     -0.62
irez      -0.62
Positive Logits
course      0.86
precaution  0.75
Course      0.69
ours        0.67
gery        0.67
ubi         0.66
sorts       0.64
truth       0.61
lined       0.61
liest       0.58
Activations Density: 0.048%