INDEX
Explanations
elements related to individuals and their actions within social and legal contexts
Negative Logits
497     -0.16
æ©      -0.16
arme    -0.16
alte    -0.15
ë       -0.14
pract   -0.14
ips     -0.14
DS      -0.14
ÃŃ      -0.14
Chu     -0.13
Positive Logits
stesso  0.16
ameda   0.15
artz    0.15
icens   0.14
LOBAL   0.14
phant   0.14
åIJ¸     0.14
odega   0.14
inx     0.14
iangle  0.14
Activations Density 0.144%