Explanations
mentions of people in various contexts and their actions or opinions
Negative Logits
Token        Logit
erner        -0.17
duk          -0.14
agina        -0.14
(not shown)  -0.14
oreach       -0.14
เอง          -0.14
floppy       -0.14
ène          -0.14
ække         -0.13
okud         -0.13
Positive Logits
Token        Logit
died         0.17
flock        0.17
talk         0.16
dying        0.15
dies         0.15
isel         0.15
kee          0.14
oho          0.14
CARE         0.14
RC           0.14
Activations Density: 0.188%
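
As a rough illustration of what a density figure like this could mean, the sketch below assumes that "activation density" is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the page itself does not say how the number was computed.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (number of sampled tokens).
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 47 active tokens out of 25,000 sampled tokens.
sample = [0.0] * 24_953 + [0.5] * 47
density = activation_density(sample)
print(f"Activations Density: {density:.3%}")
```

Under this reading, a density well below 1% would indicate a sparse feature that fires on only a small share of tokens.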