INDEX
Explanations
block start or end markers in text
NEGATIVE LOGITS
DIPSETTING      -0.58
posób           -0.53
ütfen           -0.53
WriteBarrier    -0.51
ärna            -0.50
świę            -0.50
extAlignment    -0.50
ึ้น             -0.50
httphttps       -0.50
keinem          -0.49
POSITIVE LOGITS
According       0.87
Speaking        0.82
Speaking        0.81
According       0.80
Meanwhile       0.79
Meanwhile       0.73
Commenting      0.70
Selon           0.67
according       0.66
Rea             0.66
Activations Density: 0.026%