INDEX
Explanations
frequency of phrases indicating numerical ranking or assessments
Negative Logits
ynes        -0.67
wa          -0.66
quel        -0.64
iko         -0.64
Reviewer    -0.63
iture       -0.62
WARE        -0.62
nod         -0.61
aeus        -0.61
Bought      -0.59
Positive Logits
civilization    0.79
humanity        0.74
humankind       0.73
civilisation    0.71
mankind         0.69
hostilities     0.68
sorts           0.68
our             0.68
what            0.67
adulthood       0.67
Activations Density 0.102%