INDEX
Explanations
phrases related to contrasting or delineating points
clauses or phrases that emphasize contrasts or contradictions
Negative Logits
iffe       -0.67
eni        -0.67
IER        -0.65
VIS        -0.64
Became     -0.62
hurst      -0.61
aky        -0.61
USS        -0.60
RESULTS    -0.60
uman       -0.59
Positive Logits
nor         1.78
nor         1.38
anymore     1.09
Nor         1.05
although    1.02
but         1.02
though      0.97
Rather      0.96
although    0.95
merely      0.94
Activations Density: 0.147%