INDEX
Explanations
instances of the word "Another" followed by a continuation or list of items
phrases that introduce new points or arguments
Negative Logits
hips      -0.82
ouls      -0.82
onies     -0.72
ivas      -0.70
olas      -0.69
alties    -0.68
obar      -0.67
riages    -0.67
present   -0.63
amy       -0.62
Positive Logits
worldly   0.94
aspect    0.94
notable   0.86
drawback  0.86
example   0.86
pecul     0.85
notch     0.84
factor    0.82
thing     0.81
unnamed   0.80
Activations Density 0.030%