INDEX
Explanations
phrases indicating a range or interval
instances of the word "between" indicating a range or interval
Negative Logits
gow          -0.83
OGR          -0.73
matically    -0.72
pot          -0.72
matic        -0.68
unc          -0.68
omen         -0.66
olf          -0.65
istani       -0.63
awaru        -0.62
Positive Logits
halves       1.12
extremes     0.96
sexes        0.95
rounds       0.95
genders      0.95
bouts        0.94
paragraphs   0.86
phases       0.84
worlds       0.83
layers       0.83
Activations Density: 0.045%