INDEX
Explanations
occurrences of first-person references or personal pronouns
Negative Logits
Whe       -0.15
IPA       -0.15
ектор     -0.15
rowNum    -0.14
çµ        -0.14
öld       -0.14
ोह        -0.13
Fro       -0.13
displ     -0.13
ois       -0.13
Positive Logits
.leading  0.17
Leading   0.15
leading   0.15
andler    0.15
hle       0.15
ogan      0.14
_MUT      0.14
ABLE      0.14
inline    0.14
amba      0.14
Activations Density 0.017%