Explanations
pronouns related to personal experience and identity
Negative Logits
pozw         -0.44
ล้ว          -0.37
ทอง          -0.37
BIAS         -0.37
gekomen      -0.37
Zähne        -0.37
kesulitan    -0.36
niversitesi  -0.36
richting     -0.35
ilmo         -0.35
Positive Logits
initComponents  0.73
Panamoan        0.59
thiệu           0.54
Skocz           0.53
BaseModel       0.53
="@+            0.52
فريبيس          0.50
:✨              0.50
propOrder       0.49
ніципалі        0.49
Activations Density: 0.006%