INDEX
Explanations
phrases related to comparisons or contrasts
contextual references to specific individuals or entities
Negative Logits
coerc           -0.69
."[             -0.68
".[             -0.63
withdraw        -0.62
withdrew        -0.60
withdrawing     -0.59
ertodd          -0.58
dumps           -0.57
).[             -0.57
contrace        -0.56
Positive Logits
icion           0.92
understatement  0.70
geek            0.68
erning          0.65
nerd            0.64
intrigued       0.63
Fans            0.63
cringe          0.62
Legend          0.61
fans            0.61
Activations Density 1.087%