INDEX
Explanations
names of individuals
mentions of specific individuals associated with political contexts
Negative Logits
Token        Logit
perture      -0.80
perature     -0.76
agne         -0.76
ulkan        -0.76
aturation    -0.76
izophren     -0.71
apter        -0.70
anism        -0.69
iflower      -0.69
ropolitan    -0.68
Positive Logits
Token        Logit
bred         0.82
issance      0.76
weather      0.75
yssey        0.73
waters       0.71
serv         0.69
catch        0.67
icles        0.67
ĪĴ           0.67
assemblies   0.66
Activations Density: 0.068%