INDEX
Explanations
expressions related to critique or evaluation
phrases indicating superiority or excellence
Negative Logits
Emergency      -0.68
pleted         -0.63
Located        -0.63
ospital        -0.59
icipated       -0.57
angering       -0.57
TBA            -0.56
displayText    -0.55
eligible       -0.55
mosqu          -0.55
Positive Logits
analogy         1.02
Nietzsche       0.80
philosophers    0.80
paraph          0.78
theorists       0.75
Chomsky         0.75
admittedly      0.75
quote           0.75
irony           0.72
quote           0.71
Activations Density 0.960%