INDEX
Explanations
short statements or headlines focusing on specific topics
specific phrases or concepts related to opinions or assessments
Negative Logits
nor           -0.73
ades          -0.72
eurozone      -0.71
escape        -0.66
alysed        -0.66
nowhere       -0.65
intervened    -0.64
escaping      -0.62
ammed         -0.62
Buch          -0.60
Positive Logits
Firstly       0.92
Overview      0.89
First         0.79
Appearance    0.78
Firstly       0.77
Name          0.77
ccording      0.76
Reason        0.76
Examples      0.75
Introduction  0.75
Activations Density 0.415%