INDEX
Explanations
the word "jen" or variations of family-related terminology
Negative Logits
anim        -0.17
spill       -0.15
leak        -0.15
Dud         -0.15
held        -0.14
Mad         -0.14
s           -0.14
LE          -0.14
template    -0.14
hack        -0.14
Positive Logits
abox        0.19
boro        0.16
voks        0.16
adamente    0.15
vre         0.15
íně         0.15
resco       0.14
umlu        0.14
Bylo        0.14
leston      0.14
Activations Density 0.004%