INDEX
Explanations
topics or subjects mentioned in the text
phrases that indicate a subject of discussion or debate
Negative Logits
Token        Logit
yy           -0.71
onse         -0.64
nephew       -0.64
Americas     -0.61
arat         -0.60
Mex          -0.60
behaved      -0.57
ok           -0.57
prevailed    -0.56
hes          -0.56
Positive Logits
Token        Logit
ridicule     1.05
ENTION       0.96
ire          0.95
scorn        0.89
spection     0.87
attention    0.82
controversy  0.74
scrutiny     0.71
criticism    0.71
jokes        0.69
Activations Density 0.160%