Explanations
abstract concepts or states
Negative Logits
quello         1.27
mga            1.17
Mga            1.15
qualche        1.12
stuff          1.11
información    1.10
информация     1.10
informasjon    1.10
those          1.09
sposób         1.08
Positive Logits
kinds          1.29
two            1.19
mesmos         1.16
guys           1.14
same           1.13
days           1.08
three          1.04
four           0.99
insights       0.99
mismos         0.98
Activations Density 0.134%