INDEX
Explanations
names of individuals, particularly with 'ala' or 'br' in the name
words related to the names or identifiers of individuals and groups
Negative Logits
vez         -0.73
posted      -0.70
taboola     -0.67
©¶æ         -0.63
hindsight   -0.61
DN          -0.60
disple      -0.59
polit       -0.59
wordpress   -0.59
dated       -0.58
Positive Logits
uku         0.79
zees        0.72
opol        0.70
abwe        0.70
ixture      0.68
apolis      0.66
dor         0.65
enos        0.64
olitan      0.64
inki        0.64
Activations Density 0.193%
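
The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 2 of which activate the feature.
sample = [0.0] * 998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.200%
```

Under this assumption, a density of 0.193% would mean the feature fires on roughly 2 of every 1000 sampled tokens.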