INDEX
Explanations
descriptive adjectives and nouns
NEGATIVE LOGITS
किसी    0.46
apabila    0.46
Essa    0.45
ուն    0.44
بعض    0.42
Su    0.42
عندها    0.41
بعض    0.41
una    0.41
una    0.41
POSITIVE LOGITS
Abs    0.54
Demonstr    0.48
Demonstration    0.47
Hypot    0.43
Traum    0.43
леко    0.42
Asc    0.42
Demonstrate    0.42
Elimin    0.41
Dist    0.40
Activations Density: 0.008%