Explanations
references to "other" entities or categories, particularly related to evidence or groups of items
Negative Logits
Token              Logit
másik              -0.41
المعيارى           -0.41
PhysRevD           -0.40
الحره              -0.40
diğer              -0.39
IntoConstraints    -0.37
itſelf             -0.37
inaczej            -0.36
︎                  -0.36
other              -0.35
Positive Logits
Token              Logit
worldly            1.19
than               0.80
niż                0.71
THAN               0.63
decât              0.59
similarly          0.57
kuin               0.57
similar            0.56
kinds              0.56
world              0.56
Activations Density: 0.239%