Explanations
typical examples or case studies
Negative Logits
goodies    1.07
cosas      1.02
muchas     1.01
Goodbye    1.00
persoon    0.98
eszcze     0.98
tante      0.96
dinero     0.96
jeszcze    0.95
mensen     0.95
Positive Logits
典型           1.22
selected       1.22
typical        1.14
urban          1.14
sampled        1.14
study          1.13
prototypical   1.10
typical        1.10
типи           1.09
exemplary      1.04
Activations Density 0.084%