INDEX
Explanations
language indicating a specific time or point in time
instances of a specific phrase or structure indicating a point in time
Negative Logits
punishable   -0.59
favour       -0.58
itivity      -0.57
tools        -0.56
entit        -0.56
prosecut     -0.54
favor        -0.53
forth        -0.53
direction    -0.51
compl        -0.51
Positive Logits
mosp         1.30
hens         1.27
least        1.20
kinson       1.09
letico       1.04
omic         0.99
stake        0.97
present      0.92
roc          0.88
ivan         0.84
Activations Density 0.069%