Negative Logits
G             0.38
recall        0.37
द्वारा          0.37
ًا             0.37
Speaks        0.37
WE            0.36
penjelasan    0.36
prilikom      0.36
३             0.36
terdapat      0.35
Positive Logits
slowly        0.50
используя     0.50
עם            0.49
twice         0.49
WITH          0.49
immediately   0.47
asap          0.47
sparingly     0.45
πρώτη         0.45
sweetened     0.44
Activations Density: 0.009%