INDEX
Explanations
words related to counterpoints or contrasting views followed by a strong stance or argument
the end of text tokens
Negative Logits
agra          -0.76
itto          -0.69
kie           -0.64
oufl          -0.64
chnology      -0.60
analysis      -0.60
vein          -0.59
idon          -0.59
tnc           -0.59
rongh         -0.58
Positive Logits
tons          1.09
nevertheless  0.88
alas          0.87
nonetheless   0.87
chery         0.84
fortunately   0.79
chers         0.74
luckily       0.73
hey           0.69
GHC           0.68
Activations Density: 0.099%