INDEX
Explanations
phrases indicating deviation or departure from expected norms
critical evaluations or assessments of artistic works
Negative Logits
illance        -0.82
mobilization   -0.78
mobil          -0.76
inaug          -0.75
kefeller       -0.73
arrangements   -0.71
mobilized      -0.70
amorph         -0.70
ère            -0.70
ipal           -0.69
Positive Logits
Personally     1.25
Personally     1.25
Honestly       1.10
Honestly       1.07
Nope           0.92
Whilst         0.92
cringe         0.89
Regardless     0.87
Firstly        0.87
negativity     0.87
Activations Density: 0.986%