Explanations
words indicating a shift or change in a situation
phrases indicating changes in status or transformation over time
Negative Logits
LOCK      -0.67
LLOW      -0.64
purse     -0.58
worthy    -0.57
MC        -0.56
deserve   -0.55
bodily    -0.55
Champ     -0.53
Logged    -0.53
Sik       -0.53
Positive Logits
uddenly   1.01
suddenly  0.84
today     0.75
iott      0.74
apult     0.72
instead   0.69
emer      0.69
reversed  0.69
artney    0.68
nox       0.67
Activations Density 0.624%