Explanations
instances of phrases indicating a "first time" event
phrases indicating the occurrence of events for the first time
Negative Logits
Token         Logit
asty          -0.86
ote           -0.73
tomat         -0.66
bern          -0.66
Dri           -0.61
CHAT          -0.60
Mole          -0.60
asted         -0.60
ophe          -0.59
otten         -0.59
Positive Logits
Token         Logit
ever          0.84
EVER          0.78
imaginable    0.77
ever          0.75
domestically  0.69
imester       0.66
anniversary   0.64
Ever          0.63
practicable   0.62
since         0.62
Activations Density 0.027%