Explanations
phrases indicating ongoing possession or state of being
Negative Logits
вад      -0.14
weise    -0.14
erosis   -0.14
acci     -0.13
uffy     -0.13
è²Į      -0.13
âh       -0.13
wis      -0.13
needed   -0.13
esi      -0.12
Positive Logits
got           0.27
lot           0.21
got           0.21
chances       0.21
Got           0.20
resemblance   0.19
Got           0.19
always        0.18
nothing       0.17
its           0.17
Activations Density: 0.196%