INDEX
Explanations
references to reality television shows and their dynamics
Negative Logits
deductible    -0.15
pageTitle     -0.15
iber          -0.15
icense        -0.15
кÑĢÑĭ        -0.14
elib          -0.14
otti          -0.14
rawer         -0.14
ibold         -0.14
habit         -0.14
Positive Logits
Survivor      0.19
log           0.16
elim          0.16
votes         0.16
/group        0.16
log           0.15
strateg       0.15
elimination   0.15
veget         0.15
ira           0.14
Activations Density 0.003%