Explanations
references to exile and situations involving fleeing or escaping
fled to avoid
Negative Logits
credit          -0.32
even            -0.31
sobres          -0.29
drivers         -0.29
pio             -0.29
испыты          -0.29
rrggbb          -0.29
pi              -0.29
struggling      -0.28
icordia         -0.28
Positive Logits
featureID        0.57
surla            0.56
overseas         0.55
IVEREF           0.55
Мексичка         0.55
exiles           0.54
                 0.53
Overseas         0.52
abroad           0.52
autorytatywna    0.52
Activations Density 0.041%