INDEX
Explanations
instances of negative scrutiny or criticism
phrases indicating increased scrutiny or criticism
Negative Logits
sake        -0.68
Begins      -0.67
cures       -0.63
Potion      -0.62
dwarves     -0.62
caves       -0.60
Meridian    -0.58
Journals    -0.58
apes        -0.57
ents        -0.57
Positive Logits
scrutiny    1.21
whel        0.89
criticism   0.89
suspicion   0.87
fire        0.85
pressure    0.83
attention   0.83
tremend     0.82
scanner     0.82
brunt       0.81
Activations Density 0.039%