Explanations
phrases related to occurrences or events taking place
Negative Logits
réaliste   -0.61
seamnă     -0.59
détru      -0.57
démocr     -0.57
colorés    -0.56
trouvez    -0.56
fört       -0.56
Escribe    -0.55
normaux    -0.54
détruit    -0.54
Positive Logits
going      1.04
Going      1.03
going      0.96
taking     0.92
Taking     0.89
GOING      0.89
Going      0.87
GOING      0.86
making     0.85
Taking     0.83
Activations Density 0.131%