Explanations
proper nouns related to organizations, locations, or individuals
mentions of various sources or references
Negative Logits
mere          -0.74
女            -0.73
coins         -0.70
ウス          -0.69
alpha         -0.69
ben           -0.69
tes           -0.67
ratulations   -0.65
TEXT          -0.65
opsis         -0.65
Positive Logits
various         0.92
departments     0.90
organisations   0.87
both            0.86
agencies        0.83
universities    0.81
Various         0.80
disparate       0.80
academia        0.79
organizations   0.77
Activations Density 0.197%