Explanations
"following" followed by list items
Negative Logits
тут      0.43
here     0.42
the      0.41
här      0.39
Here     0.39
یہاں     0.39
这里     0.39
zde      0.38
यहां     0.38
peur     0.37
Positive Logits
ៈ        0.55
👇       0.54
:(       0.54
:        0.53
maßen    0.52
いずれ   0.49
*:       0.47
criteria 0.46
प्रमाणे  0.44
criteria 0.43
Activations Density 0.009%