INDEX
Explanations
references to gatherings or assemblies of people
Negative Logits
entanto     -0.65
Xiu         -0.62
uș          -0.59
xins        -0.59
ſay         -0.59
îns         -0.58
ostavi      -0.58
เกิน        -0.57
expliquer   -0.57
Varian      -0.57
Positive Logits
Lovers      0.83
Lover       0.83
lovers      0.80
lovers      0.79
Lover       0.75
lover       0.72
crowd       0.71
lover       0.70
CROW        0.69
Trade       0.62
Activations Density 0.031%