INDEX
Explanations
conversational elements and greetings in the text
Negative Logits
-*-č↵          -0.14
Homer          -0.14
âĶĺ            -0.14
izik           -0.13
esser          -0.13
romise         -0.13
oreal          -0.13
aise           -0.13
ently          -0.13
forgettable    -0.13
Positive Logits
hello          0.73
Hello          0.70
Hello          0.62
hello          0.59
Hi             0.54
hi             0.53
_hello         0.50
Hi             0.49
greeting       0.48
greetings      0.46
Activations Density 0.283%