INDEX
Explanations
references to missed opportunities or alternative choices
phrases indicating potentiality or hypothetical situations
Negative Logits
ilty       -0.73
plex       -0.65
utor       -0.59
gencies    -0.57
ele        -0.57
Blitz      -0.56
istent     -0.56
ving       -0.54
bert       -0.53
Kis        -0.53
Positive Logits
feas             1.06
easily           0.99
ivably           0.99
conce            0.92
ħĭ               0.86
possibly         0.84
nesota           0.77
potentially      0.77
alternatively    0.73
heard            0.72
Activations Density: 0.139%