INDEX
Explanations
words related to authority, power, and control
terms related to threats, accusations, and declarations of significant importance
Negative Logits
ðŁij              -0.69
+++               -0.65
âĺħâĺħ            -0.65
additions         -0.62
sites             -0.60
++++++++++++++++  -0.59
aughs             -0.59
>>>>              -0.58
issance           -0.58
--+               -0.56
Positive Logits
thereof           1.00
deemed            0.97
favourable        0.94
inciting          0.90
hostile           0.87
resembling        0.87
forbidden         0.83
belonging         0.83
detrimental       0.82
of                0.81
Activations Density 0.534%