INDEX
Explanations
proper nouns, particularly names of people and possibly locations
Negative Logits
resourceCulture  -0.80
الحياه           -0.71
autorytatywna    -0.70
CanadaChoose     -0.69
vulgaires        -0.65
GEBURTSDATUM     -0.65
препратки        -0.63
distanciation    -0.62
úgó              -0.62
kuuta            -0.61
Positive Logits
<bos>          0.65
BatchNorm      0.58
McC            0.55
actionMode     0.54
Ko             0.53
cellpadding    0.53
MonoBehaviour  0.50
McC            0.50
Mer            0.50
Mc             0.49
Activations Density 0.312%