Explanations
adjectives that express strong emotions like surprise or concern
terms expressing surprise or concern
Negative Logits
resp       -0.72
agine      -0.66
elf        -0.65
ournal     -0.65
vous       -0.65
ighth      -0.61
aper       -0.60
cipl       -0.59
bey        -0.59
ravings    -0.58
Positive Logits
enough        1.07
LY            0.98
nonetheless   0.93
ly            0.86
ingly         0.82
because       0.77
considering   0.75
insofar       0.75
JPM           0.73
200000        0.69
Activations Density: 0.184%