INDEX
Explanations
phrases related to general discussions or transitions
phrases indicating conditional scenarios or alternatives
Negative Logits
intent          -0.72
necks           -0.71
usage           -0.70
mens            -0.68
repositories    -0.67
bent            -0.66
mers            -0.66
coded           -0.66
bots            -0.65
bian            -0.62
Positive Logits
thereafter      0.81
concede         0.79
................................................................  0.76
except          0.72
suffice         0.72
hew             0.72
[/              0.71
else            0.71
concludes       0.71
|--             0.70
Activations Density: 0.046%