INDEX
Explanations
references to accusations and legal terminology
Negative Logits
Token          Logit
Hentet         -0.42
GEBURTS        -0.42
actionMode     -0.41
يميديا         -0.39
ictwo          -0.37
kaynağından    -0.37
strophe        -0.37
cohort         -0.35
naturen        -0.34
识             -0.33
Positive Logits
Token          Logit
accused        0.82
accuse         0.66
lamp           0.60
accuses        0.59
accu           0.57
lamp           0.56
Italijanski    0.52
menu           0.51
reply          0.51
lamps          0.51
Activations Density 0.189%
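
The page does not say how the density figure is computed. A minimal sketch, assuming it means the fraction of token positions at which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions on which
    the feature fires; the source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 19 of which activate the feature.
sample = [0.0] * 9981 + [1.5] * 19
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density near 0.19% would mean the feature fires on roughly 2 of every 1,000 tokens, which is consistent with a sparse, topic-specific feature.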