INDEX
Explanations
statements that provide emphasis on the importance of certain information or actions
phrases emphasizing importance or significance
Negative Logits
owe         -0.73
bryce       -0.70
rose        -0.65
uron        -0.65
ramids      -0.64
chairs      -0.64
boro        -0.63
alk         -0.63
alter       -0.62
anchester   -0.62
Positive Logits
however     0.92
though      0.79
namely      0.75
moreover    0.75
insofar     0.68
besides     0.66
alas        0.65
albeit      0.64
ÃĤ          0.64
SPONSORED   0.63
Activations Density 0.172%