Explanations
phrases expressing importance or significance
phrases that emphasize significant events or situations
Negative Logits
xit         -1.01
raped       -0.79
ILCS        -0.69
via         -0.68
cheat       -0.67
ritch       -0.66
apons       -0.64
amber       -0.63
tein        -0.63
hibited     -0.62
Positive Logits
contender       0.91
(>              0.68
coincidence     0.68
Bust            0.66
difference      0.65
misconception   0.65
hitters         0.64
understatement  0.63
happening       0.63
Sense           0.62
Activations Density: 0.080%