INDEX
Explanations
names and titles related to political figures and government positions
specific references to online or social media interactions
Negative Logits
}.          -0.71
".          -0.66
?".         -0.63
};          -0.62
''.         -0.61
SPONSORED   -0.60
.).         -0.60
});         -0.60
VIDIA       -0.60
).          -0.59
Positive Logits
extensively   0.69
differently   0.68
squarely      0.61
aback         0.60
's            0.59
via           0.59
bandwagon     0.58
cautiously    0.57
separately    0.57
dilemma       0.56
Activations Density 1.000%