Explanations
references to cultural activities and community engagement
Negative Logits
indsight      -0.16
avax          -0.15
/epl          -0.14
Ú¯ÙĦ          -0.14
ÑĦÑĥн         -0.14
ailability    -0.14
ipop          -0.13
жа            -0.13
lesbienne     -0.13
irim          -0.13
Positive Logits
roles         0.39
role          0.36
enact         0.36
acting        0.35
enactment     0.35
actors        0.34
actor         0.33
Role          0.32
acted         0.31
act           0.31
Activations Density 0.188%