Explanations
novel and eclectic expressions in various contexts
articles and nouns
Negative Logits
üedad     -0.39
δύ        -0.38
either    -0.36
untung    -0.35
final     -0.35
eccell    -0.35
Small     -0.34
Either    -0.33
small     -0.32
hombro    -0.32
Positive Logits
kasarigan     0.78
########.     0.74
chimpanze     0.72
dinosaur      0.71
underwater    0.71
なんと        0.70
underwater    0.70
bikini        0.70
penguins      0.69
bikinis       0.69
Activations Density 0.215%