Explanations
sentences or phrases ending in a period
Negative Logits
Token      Value
advis      -0.76
advoc      -0.75
activ      -0.74
sal        -0.72
docks      -0.68
rall       -0.67
tyr        -0.67
aw         -0.67
comm       -0.66
overdue    -0.64
Positive Logits
Token           Value
Instead         1.92
Nor             1.71
Rather          1.61
Neither         1.53
Instead         1.52
Nonetheless     1.37
Nevertheless    1.36
nor             1.35
Nor             1.31
Quite           1.28
Activations Density: 0.475%