Explanations
discursive phrases introducing opinion or elaboration
Negative Logits
]<<           -0.70
</caption>    -0.68
]<<"          -0.68
]]=           -0.68
prostu        -0.66
__":          -0.65
noqa          -0.63
Penh          -0.60
ÁB            -0.59
eseorang      -0.58

Positive Logits
yes           0.91
why           0.87
hey           0.78
oh            0.77
yet           0.74
don           0.73
who           0.73
therein       0.72
then          0.70
yeah          0.70
Activations Density: 0.135%