INDEX
Explanations
explaining or clarifying concepts
NEGATIVE LOGITS
besides   0.49
auch      0.41
ാനും      0.40
ствии     0.39
firearms  0.39
fared     0.39
oltre     0.39
також     0.39
neben     0.38
Neben     0.38
POSITIVE LOGITS
也就是说      1.44
つまり        1.41
Essentially   1.39
Basically     1.34
Essentially   1.27
Basically     1.16
অর্থাৎ        1.13
অর্থাৎ        1.12
Imagine       1.12
essentially   1.11
Activations Density 0.070%