INDEX
Explanations
dates, names, and locations typically seen in news articles
geographical locations and proper nouns
Negative Logits
Reincarn    -0.82
Paras       -0.78
Dalai       -0.72
BILITIES    -0.67
interrog    -0.67
propag      -0.67
Recep       -0.65
ransom      -0.65
Reloaded    -0.64
人          -0.64
Positive Logits
esville     1.17
erville     1.17
mington     1.16
mingham     1.05
heny        1.04
neapolis    1.04
ville       1.02
brook       1.01
lington     1.00
town        1.00
Activations Density 0.335%