Explanations
sentences or phrases that convey quotes or reported speech
Negative Logits
ahat          -0.15
над           -0.15
ÙĦب           -0.15
asso          -0.15
ct            -0.14
elman         -0.14
alia          -0.14
inct          -0.14
clip          -0.14
WithOptions   -0.14
Positive Logits
agus          0.19
ondo          0.16
ernen         0.16
šov           0.14
amet          0.14
зв            0.14
uras          0.14
uset          0.13
erton         0.13
Elite         0.13
Activations Density: 0.044%