INDEX
NEGATIVE LOGITS
fre           0.67
,《           0.66
supervision   0.65
stimul        0.65
comen         0.64
-             0.64
,             0.63
bols          0.63
よい          0.61
itters        0.61
POSITIVE LOGITS
They          0.94
éton          0.92
Warning       0.89
与其          0.87
Their         0.87
When          0.86
Goodbye       0.86
Don           0.86
𝘞             0.86
Certain       0.86
Activations Density 0.194%