INDEX
Explanations
possessive pronouns used in a context relating to individuals and their relationships
Negative Logits
apore     -0.16
анг       -0.15
isci      -0.15
cảnh      -0.14
.second   -0.14
atted     -0.14
stral     -0.14
avers     -0.14
085       -0.14
erable    -0.13
Positive Logits
troop        0.15
redi         0.15
ÏģÏī         0.14
correspond   0.14
audience     0.14
nech         0.14
Male         0.14
acceler      0.14
Higher       0.14
æŀļ          0.13
Activations Density 0.122%