INDEX
Explanations
words or phrases related to governmental statements or decisions
phrases related to emotional or psychological conflicts
Negative Logits
Token      Value
eatures    -0.78
behav      -0.74
proble     -0.74
agre       -0.73
srf        -0.68
ursday     -0.64
occas      -0.64
ukong      -0.63
enthusi    -0.63
chall      -0.63
Positive Logits
Token                               Value
ãĤĮ                                 0.83
âĸĢ                                 0.78
%"                                  0.78
ãģį                                 0.75
é¾į                                 0.72
μ                                  0.70
âĸĪâĸĪâĸĪâĸĪ                        0.69
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    0.68
çļĦ                                 0.67
âĸij                                 0.66
Activations Density 0.271%
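
Several positive-logit tokens above (for example ãĤĮ or çļĦ) look like mojibake, but they are consistent with how byte-level BPE vocabularies print non-ASCII tokens: each raw byte is shown as one stand-in character. The sketch below assumes, without confirmation from this page, that the tokens come from a GPT-2-style byte-level vocabulary. It rebuilds that byte-to-character table and inverts it to recover readable text. The names `bytes_to_unicode` and `decode_token` are illustrative helpers, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2-style byte-level BPE.

    Assumption: the dashboard's tokens use this exact mapping.
    """
    # Bytes that already have a printable character keep it.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Remaining bytes (control chars, space, etc.) are shifted to U+0100 and up.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a vocabulary string back into text by undoing the byte mapping."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial UTF-8 sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤĮ", "ãģį", "é¾į", "çļĦ", "âĸĪâĸĪâĸĪâĸĪ", "behav"]:
        print(repr(tok), "->", decode_token(tok))
```

Under that assumption, ãĤĮ decodes to れ, ãģį to き, é¾į to 龍, and çļĦ to 的, while the â-prefixed tokens decode to block-drawing characters. ASCII fragments such as behav pass through unchanged. A character outside the table (a literal space, for instance) raises `KeyError`, which signals that the token was not in this encoding.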