Explanations
references to experiences of hardship or injury
Negative Logits
antz      -0.17
aldo      -0.17
elden     -0.16
Locker    -0.16
forg      -0.15
аний      -0.15
alf       -0.14
gard      -0.14
ayar      -0.14
erse      -0.14
Positive Logits
chances   0.23
odds      0.19
çļĦè¯Ŀ    0.16
ilio      0.15
acci      0.14
383       0.14
thì       0.14
chai      0.13
ilo       0.13
Odds      0.13
Activations Density: 0.053%