Explanations
phrases related to specific actions or events happening
isolated clauses that provide context or details in a sentence
Negative Logits
ety           -0.70
¬¼            -0.70
raved         -0.69
iple          -0.64
rite          -0.63
rio           -0.63
ITE           -0.61
ocl           -0.61
scription     -0.59
Founding      -0.59
Positive Logits
whereas        1.13
thereby        1.09
respectively   1.06
regardless     1.03
preferably     0.97
resulting      0.95
namely         0.94
etc            0.94
although       0.92
depending      0.92
Activations Density 0.810%