INDEX
Explanations
phrases related to legal issues and statements
phrases related to societal issues and existential questions
Negative Logits
Token    Logit
âĢ       -1.03
''.      -0.94
Others   -0.87
.''      -0.83
.''.     -0.79
?ãĢį     -0.77
âĸł      -0.77
âĢ       -0.76
)\       -0.76
ÂŃ       -0.75
Positive Logits
Token        Logit
EVERY        1.05
HUGE         0.99
VERY         0.92
absolutely   0.84
ONLY         0.83
ALL          0.81
NEVER        0.80
ALWAYS       0.77
NOT          0.77
REALLY       0.73
Activations Density: 2.113%