INDEX
Explanations
adjusting or fiddling with something
Negative Logits
التي  1.39
من  1.39
في  1.38
Approximately  1.36
بشكل  1.32
این  1.26
specified  1.24
Category  1.22
Additionally  1.21
Gambar  1.21
POSITIVE LOGITS
honest  0.86
nice  0.80
sorry  0.78
lots  0.77
俺  0.70
fed  0.70
sold  0.70
plan  0.69
success  0.68
yeah  0.68
Activations Density: 0.015%