INDEX
Explanations
conversational prompts and inquiries
Negative Logits
oga        -0.15
instr      -0.14
lah        -0.14
iyah       -0.14
antal      -0.14
ace        -0.14
angu       -0.14
aha        -0.14
ra         -0.14
Table      -0.13
Positive Logits
opinions    0.30
opinion     0.28
opin        0.28
Thoughts    0.26
thoughts    0.26
Opinion     0.23
comments    0.21
feedback    0.20
reactions   0.20
acomment    0.20
Activations Density 0.139%