Explanations
names of individuals and references to their actions or roles
Negative Logits
ailability    -0.66
citiz         -0.59
conflic       -0.58
icable        -0.58
BIL           -0.57
nah           -0.56
Seym          -0.56
Kardash       -0.55
Celsius       -0.55
etheless      -0.54
Positive Logits
zzi           1.00
ieri          0.88
igham         0.84
Jr            0.78
's            0.73
xus           0.72
Presents      0.70
zzo           0.69
inelli        0.69
Sr            0.69
Activations Density: 0.137%