INDEX
Explanations
phrases related to news events and incidents
occurrences of commas in the text
Negative Logits
token        logit
ocl          -0.82
ction        -0.76
estones      -0.69
onym         -0.69
cel          -0.67
matically    -0.66
itialized    -0.65
ety          -0.64
erv          -0.63
uten         -0.63
Positive Logits
token          logit
according      0.98
respectively   0.92
including      0.91
thereby        0.88
prompting      0.86
SPONSORED      0.83
namely         0.83
citing         0.83
albeit         0.82
ItemTracker    0.80
Activations Density: 0.937%