INDEX
Explanations
phrases containing conjunctions, often 'and'
occurrences of the word "and" indicating connections or relationships between ideas
Negative Logits
ummer    -0.80
atan     -0.74
utor     -0.71
qqa      -0.69
anmar    -0.67
EMS      -0.65
uto      -0.65
BAT      -0.64
umo      -0.64
qu       -0.63
Positive Logits
wondered    1.08
wished      0.98
urge        0.96
knew        0.93
wants       0.89
urges       0.88
knows       0.88
believes    0.86
wanted      0.85
wishes      0.83
Activations Density 0.315%