Explanations
instances where the word "it" is followed by another word or phrase
phrases that indicate uncertainty or conditionality
Negative Logits
Token             Logit
Et                -0.72
TT                -0.72
ands              -0.71
onement           -0.70
Ts                -0.69
IGN               -0.69
soDeliveryDate    -0.69
lette             -0.68
lets              -0.67
onds              -0.67
Positive Logits
Token             Logit
technically        1.03
disagree           0.99
admittedly         0.98
differed           0.96
disagreed          0.93
slight             0.92
occasional         0.87
initially          0.85
differs            0.84
concede            0.83
Activations Density 0.236%