INDEX
Explanations
capital letters
letters and specific characters in the text
Negative Logits
Token         Logit
debit         -0.67
çͰ            -0.66
00007         -0.64
Scion         -0.61
Mub           -0.60
elig          -0.59
éĹĺ           -0.58
Hurricanes    -0.57
Totem         -0.56
Chiefs        -0.55
Positive Logits
Token         Logit
ilus          0.91
urnal         0.83
obic          0.82
otropic       0.80
aceae         0.79
Henry         0.74
igen          0.72
ical          0.72
alf           0.71
oric          0.71
Activations Density: 0.061%