INDEX
Negative Logits
であった    0.53
Were    0.50
学ぶ    0.49
comienzan    0.47
Required    0.45
即可    0.45
ってしまう    0.44
Were    0.43
となっている    0.43
しまう    0.43
POSITIVE LOGITS
heard    0.62
glimps    0.56
learned    0.54
spent    0.52
messed    0.51
ate    0.51
caught    0.50
did    0.50
bought    0.49
chased    0.48
Activations Density 0.145%