INDEX
Explanations
time durations
the word "until" in various contexts
Negative Logits
pour         -0.71
certain      -0.69
hazard       -0.69
LAB          -0.67
venture      -0.66
aque         -0.65
puff         -0.65
cript        -0.65
founded      -0.64
cigarette    -0.64
Positive Logits
itures       0.76
ithub        0.70
atcher       0.69
adulthood    0.69
proven       0.69
llular       0.66
opa          0.65
aments       0.65
terday       0.64
ishers       0.64
Activations Density 0.030%