INDEX
Explanations
names or references to individuals and their actions
Negative Logits
Token          Logit
Trave          -0.67
Daly           -0.66
ishable        -0.64
Brisbane       -0.59
Millennials    -0.57
Berry          -0.57
Sovereign      -0.56
Fla            -0.56
cientious      -0.56
Burg           -0.56
Positive Logits
Token          Logit
consisted      0.88
belonged       0.76
reportedly     0.76
anyahu         0.76
was            0.76
testified      0.76
specializes    0.74
wasn           0.74
alone          0.73
consists       0.73
Activations Density 2.348%