INDEX
Explanations
phrases related to intention, purpose or strategy
phrases indicating methods or ways of achieving objectives or making a point
Negative Logits
tis         -1.13
*/          -0.80
Indigo      -0.79
semble      -0.74
stars       -0.74
omes        -0.70
Elements    -0.69
Tos         -0.68
TJ          -0.67
modules     -0.67
Positive Logits
pretext         1.25
excuse          1.12
blackmail       1.11
ploy            1.06
justify         0.98
circumvent      0.98
leverage        0.96
justifying      0.95
distract        0.92
justification   0.91
Activations Density: 0.563%