INDEX
Explanations
mentions of the word "Obama."
instances of the string "ama"
Negative Logits
ting      -0.90
ted       -0.86
tle       -0.78
points    -0.74
lip       -0.73
ried      -0.73
inav      -0.71
ther      -0.71
working   -0.70
sheet     -0.70
Positive Logits
qua       0.99
ña        0.95
ñ         0.95
utra      0.94
ican      0.90
eus       0.87
ama       0.84
zzle      0.82
emn       0.82
龍喚士    0.81
Activations Density 0.021%
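
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions where the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires at all. The dashboard's own definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, of which 2 activate the feature.
acts = [0.0] * 10_000
acts[17] = 3.2
acts[4242] = 0.7

density = activation_density(acts)
print(f"Activations Density {density:.3%}")  # prints "Activations Density 0.020%"
```

Under this reading, a density of 0.021% corresponds to roughly one active token in every 4,800.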