INDEX
Explanations
names of individuals involved in various news stories
names of notable individuals
Negative Logits
Maria       -0.77
į           -0.73
ference     -0.67
Els         -0.67
uate        -0.65
LR          -0.63
cp          -0.62
cknowled    -0.62
Laura       -0.61
omatic      -0.60
Positive Logits
Sr          0.89
ovich       0.81
Jr          0.79
enegger     0.78
III         0.74
QC          0.72
campaigned  0.69
imperson    0.69
agher       0.68
uttered     0.66
Activations Density: 0.132%