Explanations
phrases with the structure "the first time"
references to events that are described as the "first time" something has occurred
Negative Logits
ulk        -0.72
ribut      -0.69
atu        -0.69
ibaba      -0.68
omo        -0.65
pletion    -0.64
ffee       -0.63
ensibly    -0.63
hua        -0.62
eper       -0.62
Positive Logits
culprit        0.88
instances      0.82
pmwiki         0.78
phenomenon     0.78
vex            0.75
shenan         0.73
pecul          0.72
accol          0.72
controversy    0.72
troubling      0.72
Activations Density 0.060%