Explanations
references to powerful families or individuals associated with control or influence
Negative Logits
ddelweddau        -0.81
__":              -0.80
oprot             -0.80
]))               -0.77
>=",              -0.76
resourceCulture   -0.75
endblock          -0.74
"):               -0.73
}))               -0.73
"];               -0.73
Positive Logits
ята
0.49
/
0.48
/
0.48
~
0.47
伏
0.47
~
0.47
L
0.47
!
0.47
Z
0.45
O
0.45
Activations Density 0.147%