Explanations
phrases related to statements or positions on particular topics
verbs indicating advocacy, defense, or emotional response in discussions
Negative Logits
Ther        -0.76
wayne       -0.68
ILCS        -0.66
window      -0.64
opus        -0.64
brance      -0.63
scribe      -0.62
Base        -0.61
Contents    -0.61
ãĥİ         -0.61
Positive Logits
sarcast     0.82
anke        0.74
Wednesday   0.74
Thursday    0.74
Tuesday     0.73
dismiss     0.68
Monday      0.67
upbeat      0.66
yesterday   0.66
brisk       0.66
Activations Density: 0.222%