INDEX
Explanations
phrases indicating repeated patterns or occurrences
phrases indicating repetitive actions or occurrences
Negative Logits
arling   -0.78
agar     -0.69
yt       -0.68
bda      -0.67
agra     -0.66
stood    -0.66
ey       -0.66
edu      -0.65
ajor     -0.64
ondon    -0.64
Positive Logits
someone      1.07
somebody     1.05
soever       1.00
imaginable   0.99
possible     0.94
someone      0.92
else         0.87
conceivable  0.85
you          0.83
we           0.77
Activations Density 0.052%