INDEX
Explanations
mentions of personal matters or opinions, especially related to health, politics, or lifestyle
references to personal information and experiences
Negative Logits
xual          -1.17
UMP           -0.79
REG           -0.76
LER           -0.73
etting        -0.71
å¾            -0.71
LOS           -0.68
NESS          -0.68
Archdemon     -0.67
ÄŁ            -0.67
Positive Logits
ised          1.28
ized          1.13
ization       1.10
isation       1.05
izing         1.03
izations      1.03
ities         1.02
belongings    1.02
isations      0.97
izes          0.97
Activations Density 0.056%