INDEX
Explanations
names of people and locations
occurrences and variations of the word "of"
Negative Logits
attribute   -0.80
disag       -0.78
kered       -0.76
parency     -0.74
*.          -0.72
TEXT        -0.69
ioned       -0.68
surrog      -0.64
âĨij         -0.63
void        -0.62
Positive Logits
Pasadena    0.81
Auckland    0.80
Sicily      0.76
Ward        0.75
whom        0.74
Queens      0.74
Rochester   0.74
Emer        0.74
Elm         0.73
Antioch     0.73
Activations Density 0.089%