INDEX
Explanations
specific descriptors for individuals and their professions
Negative Logits
Token     Logit
abilit    -0.15
ána       -0.15
adores    -0.15
uti       -0.14
ži        -0.14
748       -0.14
Intr      -0.14
áte       -0.14
Inv       -0.13
fik       -0.13
Positive Logits
Token       Logit
polit       0.18
politik     0.17
jur         0.16
ped         0.16
paint       0.16
politician  0.16
painting    0.15
polit       0.15
_paint      0.15
ge          0.15
Activations Density: 0.028%