INDEX
Explanations
time-related phrases indicating the present or ongoing nature of certain situations
Negative Logits
Consent    -0.75
sudden     -0.64
oops       -0.61
haste      -0.60
hens       -0.57
Intro      -0.56
kin        -0.56
Means      -0.56
Exit       -0.55
Fix        -0.55
Positive Logits
haunt        0.87
descendants  0.83
proudly      0.81
haunted      0.75
reverber     0.74
adays        0.74
enshr        0.73
extant       0.72
regretted    0.72
etched       0.72
Activations Density 0.210%