Explanations
occurrences of the word "their" in various forms
Negative Logits
Token              Logit
/*                 -0.70
Jefus              -0.61
MLLoader           -0.57
affari             -0.57
resourceCulture    -0.55
omial              -0.54
kasarigan          -0.54
Slf                -0.54
OGND               -0.53
ロウィン            -0.53
Positive Logits
Token              Logit
their              1.25
their              1.23
Their              1.19
mereka             1.04
Their              1.02
they               1.00
they               1.00
They               0.98
kanilang           0.97
THEIR              0.94
Activations Density: 0.093%