INDEX
Explanations
specific language identifiers or classifiers associated with cultural contexts
Negative Logits
20439
-0.72
favor       -0.71
favour      -0.66
juggling    -0.65
accompl     -0.64
cloth       -0.63
Franch      -0.63
gag         -0.62
itaire      -0.62
Pathfinder  -0.61
Positive Logits
Ŀ   1.27
Ĩ   1.18
¹   1.14
ķ   1.11
ĭ   1.08
¶   1.07
Ĥ¬  1.05
§   1.03
ĺ   1.00
¦   0.99
Activations Density 0.002%