INDEX
Explanations
phrases indicating emphasis or importance
statements that express significant observations or insights
Negative Logits
endars      -0.71
ebook       -0.69
swick       -0.67
MpServer    -0.66
feces       -0.62
ihu         -0.60
eater       -0.60
acements    -0.60
scrub       -0.57
etimes      -0.56
Positive Logits
caveat          0.87
misconception   0.87
Quote           0.85
mantra          0.80
assumption      0.78
question        0.78
fallacy         0.77
oft             0.77
reason          0.77
recurring       0.76
Activations Density 0.804%