INDEX
Explanations
phrases related to software releases and updates
sentences that end with a period
Negative Logits
arsen       -0.74
',"         -0.73
ibur        -0.69
poisoning   -0.68
pse         -0.67
boycot      -0.67
affili      -0.67
challeng    -0.66
stagn       -0.66
'."         -0.65
Positive Logits
Includes    1.41
Contains    1.26
Allows      1.22
Cannot      1.17
jpg         1.15
Including   1.12
Comes       1.11
Retrieved   1.11
            1.08
Seems       1.08
Activations Density 0.317%