Explanations
occurrences of the word "asked" and its variants in various contexts
Negative Logits
Token         Logit
Ĥ¬            -0.70
Fit           -0.68
herence       -0.66
Est           -0.63
ources        -0.62
mars          -0.61
Siber         -0.60
Confeder      -0.59
rats          -0.59
Inc           -0.57
Positive Logits
Token         Logit
questions     1.11
repeatedly    0.90
question      0.88
rhet          0.86
naires        0.86
asked         0.84
naire         0.82
probing       0.81
forgiveness   0.80
Questions     0.78
Activations Density: 0.018%