Explanations
phrases indicating a need for external validation or support
phrases that indicate the necessity of citations or sources
Negative Logits
cipled      -0.96
edIn        -0.77
Monitor     -0.76
apesh       -0.75
Grid        -0.74
Pur         -0.73
Console     -0.72
FTWARE      -0.71
ahime       -0.71
Bench       -0.71
Positive Logits
redacted    0.88
çīĪ         0.71
sentence    0.70
omitted     0.70
reserved    0.68
included    0.65
translator  0.65
ities       0.65
spoilers    0.65
ital        0.64
Activations Density: 0.106%