Explanations
actions and possessions of "you" or "I"
Negative Logits
ровании  0.38
celebrado  0.38
дание  0.38
ையாள  0.37
மையான  0.37
গিয়েছিল  0.36
বলা  0.35
большого  0.34
WebApplication  0.34
Mini  0.34
POSITIVE LOGITS
mentioned  0.52
deems  0.46
admires  0.42
admire  0.42
mentions  0.42
chose  0.41
jupyter  0.41
পরে  0.41
despise  0.39
owns  0.39
Activations Density 0.016%