INDEX
Explanations
strong, emotionally charged language related to political or military discussions
Negative Logits
Token         Logit
['            -0.67
nell          -0.62
Canaveral     -0.61
æ©            -0.61
"},"          -0.59
stunts        -0.59
gull          -0.58
thriving      -0.58
valued        -0.58
thri          -0.57
Positive Logits
Token         Logit
summarize     1.20
recap         1.18
briefly       1.10
summar        1.00
disclaimer    0.97
spoiler       0.96
Introduction  0.96
endix         0.92
summarizes    0.89
reiterate     0.88
Activations Density 3.637%