INDEX
Explanations
phrases indicating something observed or happening up to a certain point in time
phrases that indicate progression or distance in time
Negative Logits
-+            -0.65
cause         -0.65
kefeller      -0.65
andal         -0.64
equal         -0.62
MpServer      -0.61
me            -0.61
imb           -0.60
inges         -0.59
SourceFile    -0.59
Positive Logits
there         0.90
we            0.87
adays         0.78
however       0.76
nobody        0.75
they          0.72
though        0.69
everyone      0.66
everybody     0.65
it            0.63
Activations Density 0.113%
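
The density figure can be read as the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading. It assumes density means "fraction of tokens with activation above zero", which this page does not state. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The source does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 tokens, of which exactly one activates.
acts = [0.0] * 999 + [2.5]
density = activation_density(acts)
print(f"{density:.3%}")  # 0.100%
```

Under this reading, a density of 0.113% would correspond to roughly 1 active token in every 885 sampled.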