Explanations
phrases related to speech or quotes
references to speech and statements made by individuals
Negative Logits
Token       Logit
shoot       -0.73
onut        -0.69
equipped    -0.67
xtap        -0.67
amate       -0.64
Claw        -0.64
ticket      -0.62
pend        -0.61
fullest     -0.61
ワン        -0.60
Positive Logits
Token       Logit
goodbye     1.36
aloud       1.33
about       1.19
ABOUT       0.95
loudly      0.91
about       0.88
publicly    0.87
Goodbye     0.85
hello       0.84
afterwards  0.83
Activations Density 0.048%