Explanations
mentions of people's previous occupations or roles
Negative Logits
Token        Logit
luck         -0.84
orsi         -0.75
edly         -0.73
Results      -0.72
eming        -0.70
SPONSORED    -0.70
ulence       -0.70
Reason       -0.68
reasonable   -0.67
helm         -0.67
Positive Logits
Token        Logit
midst        1.14
guise        0.96
Department   0.95
vicinity     0.94
Philippines  0.91
department   0.91
trenches     0.90
field        0.90
meantime     0.88
aftermath    0.86
Activations Density 0.182%