INDEX
Explanations
phrases indicating repetition or frequency
references to repeated actions or occurrences
Negative Logits
Token           Logit
Reviewer        -0.90
XT              -0.78
heirs           -0.77
ourt            -0.68
rats            -0.67
Marginal        -0.66
DIT             -0.65
andr            -0.62
etermination    -0.62
rera            -0.62
Positive Logits
Token           Logit
consecut        0.98
points          0.87
throughout      0.86
during          0.74
apiece          0.74
before          0.72
cale            0.71
lot             0.71
coded           0.71
orial           0.70
Activations Density: 0.054%