INDEX
Explanations
phrases that emphasize particular attributes or characteristics
Negative Logits
Republic          -0.50
republic          -0.46
culture           -0.46
JI                -0.46
abandoned         -0.46
sulphuric         -0.46
altezza           -0.46
kaarangay         -0.45
ensee             -0.45
Civil             -0.45
Positive Logits
especially         0.68
(no token shown)   0.60
Sünde              0.57
cortesía           0.55
Wikiseite          0.55
Irán               0.55
especially         0.52
especialmente      0.50
christlichen       0.49
dientemente        0.49
Activations Density: 0.242%