Explanations
informal expressions indicating a sense of urgency or necessity
expressions of obligation or necessity
Negative Logits
Token      Logit
edly       -0.78
furt       -0.77
lies       -0.76
livious    -0.75
croft      -0.73
liness     -0.71
lund       -0.70
lier       -0.69
hips       -0.69
lique      -0.67
Positive Logits
Token      Logit
gotta      0.88
nab        0.81
buckle     0.78
avorite    0.74
wanna      0.74
ta         0.74
FINE       0.74
notch      0.73
stay       0.71
ditch      0.70
Activations Density: 0.022%
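
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of tokens on which the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density` and the sample data are hypothetical and purely illustrative.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all; this is an illustration, not the dashboard's own code.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, of which only 2 activate the feature.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.020%
```

Under this reading, a density of 0.022% would mean the feature fires on roughly 2 of every 10,000 tokens, which fits a feature tied to a narrow set of informal expressions.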