INDEX
Explanations
phrases that hint at or prompt for a response from the reader
expressions that prompt agreement or validation in conversation
Negative Logits
itiz     -0.81
»Ĵ       -0.76
inki     -0.69
ör       -0.68
hurst    -0.67
aum      -0.66
ido      -0.65
opens    -0.64
icum     -0.62
icon     -0.62
Positive Logits
?'       0.83
?!       0.82
?'"      0.81
?).      0.77
?".      0.76
?ãĢį     0.74
?]       0.74
?        0.74
????     0.74
!?"      0.73
Activations Density: 0.086%