INDEX
Explanations
proper nouns, particularly names and places
Negative Logits
aarrggbb     -0.91
ecap         -0.88
dafx         -0.86
fillment     -0.85
Magi         -0.83
Waller       -0.81
propOrder    -0.81
cytok        -0.81
Judea        -0.81
Karn         -0.81
Positive Logits
Portail      0.81
Lou          0.80
Lou          0.77
OUG          0.77
Cou          0.77
"]           0.77
LOU          0.76
}]           0.71
lou          0.71
Portail      0.70
Activations Density: 3.252%