INDEX
Explanations
relevant terms denoting a personal or political nature
terms related to personal and political experiences or themes
Negative Logits
Token       Logit
osphere     -0.70
ioxide      -0.68
otaur       -0.67
bey         -0.67
ournal      -0.66
irit        -0.65
iage        -0.65
secution    -0.65
mosp        -0.65
Appears     -0.64
Positive Logits
Token        Logit
enough       1.10
arily        0.97
isable       0.88
istically    0.87
aneously     0.82
izable       0.80
nell         0.78
nonetheless  0.76
itably       0.73
corrid       0.70
Activations Density: 0.408%