Explanations
short phrases related to crime and legal actions
phrases containing numerical or statistical references
Negative Logits
ocl           -0.77
lihood        -0.74
istical       -0.68
Newsletter    -0.67
ideon         -0.66
sburg         -0.65
enth          -0.65
itis          -0.64
zek           -0.64
conom         -0.64
Positive Logits
prompting     0.92
according     0.88
sparking      0.83
CNBC          0.81
citing        0.79
despite       0.75
claiming      0.74
forcing       0.74
aka           0.73
including     0.73
Activations Density: 0.322%