INDEX
Explanations
phrases that provide disclaimers or additional context
providing context or side notes
Negative Logits
Pflichten       -0.44
gemens          -0.43
pouvoit         -0.42
avoient         -0.42
feroit          -0.42
kurtka          -0.41
betweenstory    -0.40
området         -0.40
quæ             -0.40
szczeg          -0.39
Positive Logits
FWIW            0.72
FYI             0.68
FYI             0.66
brigens         0.66
Билгалдахарш    0.52
btw             0.52
лтамалар        0.48
一応            0.48
BTW             0.47
الرياضيه        0.46
Activations Density 0.023%