Explanations
political and economic control
Negative Logits
Е    0.51
Α    0.50
Έ    0.47
odlič    0.46
레    0.46
혔    0.46
)||    0.44
ரா    0.44
ικού    0.44
পৃথিব    0.44
Positive Logits
slag    0.53
emotion    0.53
old    0.52
financial    0.48
swe    0.47
spring    0.46
فقط    0.46
awareness    0.46
affected    0.45
絫    0.45
Activations Density: 0.002%