INDEX
Explanations
phrases that summarize or evaluate the subjects positively
overall assessment
Negative Logits
ⓧ    -0.76
LLocation    -0.67
хьтан    -0.67
يتيمه    -0.66
***!    -0.63
الإنجليزية    -0.62
Italijani    -0.62
queſta    -0.62
wireType    -0.62
rungsseite    -0.62
Positive Logits
Overall    0.90
overall    0.90
Overall    0.87
overall    0.78
总体    0.63
OVERALL    0.61
整體    0.60
整体    0.54
Insgesamt    0.52
keseluruhan    0.50
Activations Density: 0.006%