INDEX
Explanations
quotations or reported speech
expressions of dialogue or quotations
Negative Logits
Token       Logit
prohib      -0.74
piled       -0.62
pole        -0.61
irtual      -0.60
habitable   -0.59
interpre    -0.59
enary       -0.59
nearest     -0.58
similarly   -0.58
Lau         -0.57
Positive Logits
Token             Logit
YES               0.88
congratulations   0.88
hey               0.88
Oh                0.86
bye               0.84
prest             0.81
HHHH              0.80
bye               0.80
sir               0.80
GOD               0.78
Activations Density: 0.212%