Explanations
phrases indicating speech or opinion
Negative Logits
ILCS     -0.74
xtap     -0.66
odore    -0.65
brance   -0.64
pees     -0.64
Cop      -0.63
zens     -0.63
isable   -0.63
\/\/     -0.63
FILE     -0.62
Positive Logits
bluntly    0.98
sarcast    0.92
:"         0.92
"[         0.89
goodbye    0.88
aloud      0.87
anecd      0.84
"â̦        0.83
:[         0.82
succinct   0.81
Activations Density 0.268%
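
As a rough illustration of what a density figure like the one above could mean, here is a minimal sketch. It assumes that "activation density" is the fraction of token positions on which the feature's activation is above zero; the source does not state the definition. The function name `activation_density` and the sample data are hypothetical and are not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this assumed definition, a density of 0.268% would correspond to the feature firing on roughly 27 of every 10,000 token positions.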